This seems like an area where a mutable API beats an immutable API. It's very easy to accidentally reuse old states when doing functional programming. On the other hand, if you need a way to serialize the RNG (to send it to another task, perhaps), then the risk of reusing states comes back, because the serialized copy and the original can both keep generating from the same state.
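To make the reuse hazard concrete, here is a minimal Python sketch of an immutable-style API. The `Rng` type and `next_u64` function are hypothetical names made up for illustration, and the generator is a plain 64-bit LCG (Knuth's MMIX constants), not any particular library's RNG.

```python
from typing import NamedTuple, Tuple

MASK64 = (1 << 64) - 1


class Rng(NamedTuple):
    """Immutable generator state (hypothetical type, for illustration only)."""
    state: int


def next_u64(rng: Rng) -> Tuple[int, Rng]:
    """Return a value and the NEW generator; the old one is left untouched."""
    # 64-bit LCG step with Knuth's MMIX constants; a stand-in, not a good RNG.
    new_state = (rng.state * 6364136223846793005 + 1442695040888963407) & MASK64
    return new_state, Rng(new_state)


rng0 = Rng(42)
a, rng1 = next_u64(rng0)

# Bug: passing the stale rng0 again silently repeats the first value.
b, _ = next_u64(rng0)

# Correct: thread the returned generator through.
c, rng2 = next_u64(rng1)
```

Nothing fails at runtime in the buggy line; `a` and `b` are simply the same number. A mutable API makes this particular mistake impossible, but copying or serializing a mutable generator brings it back.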
The counter-based RNG I was looking at is an immutable API. It makes it easy to initialize generators in separate tasks by feeding each one a unique ID, but figuring out how to do a split operation is harder. Conceptually, the way to go is to multiply the key by 2 and add either 0 or 1 for each branch, to give them unique IDs. (This is called a pedigree.) However, each split uses up one bit of the key, so for unbalanced trees, you will run out of bits on the deepest path long before you have created many generators. Figuring out how to deal with that seems tricky.
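A rough Python sketch of the pedigree idea follows. Everything in it is an assumption for illustration: the 64-bit key width, the function names, and the use of BLAKE2b from `hashlib` as a stand-in for a real counter-based generator. I also start the root at key 1 rather than 0, since with a root of 0 the left child (2*0 + 0) would get the same key as its parent.

```python
import hashlib
from typing import Tuple

KEY_BITS = 64  # assumed key width
ROOT = 1       # assumed root key; the leading 1 bit marks the depth


def split(key: int) -> Tuple[int, int]:
    """Derive two child keys by appending a 0 or a 1 bit to the pedigree."""
    left, right = 2 * key, 2 * key + 1
    if right >= 1 << KEY_BITS:
        raise OverflowError("pedigree no longer fits in the key")
    return left, right


def random_u64(key: int, counter: int) -> int:
    """Stateless output as a function of (key, counter).

    BLAKE2b stands in for a real counter-based generator here.
    """
    data = key.to_bytes(8, "little") + counter.to_bytes(8, "little")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")


def max_chain_depth() -> int:
    """Count how many splits a fully unbalanced tree allows before overflow."""
    key, depth = ROOT, 0
    try:
        while True:
            _, key = split(key)  # always descend into the right child
            depth += 1
    except OverflowError:
        return depth


left, right = split(ROOT)
chain_depth = max_chain_depth()
```

With these assumptions a single chain of splits overflows after 63 levels, even though only 127 keys have been handed out along the way, which is the unbalanced-tree problem in miniature.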
The LXM generator paper (non-crypto) looks like good reading:
https://dl.acm.org/doi/pdf/10.1145/3485525
They initialize the new child with the next random values of the parent. There is an extra parameter that they use to add more input bits, to reduce the probability of overlap.
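As a toy illustration of that style of split (this is not the LXM algorithm, just a sketch of "seed the child from the parent's next outputs"), here is a small mutable generator in Python. The class name, the LCG, and the choice of a per-instance odd increment as the "extra parameter" are all my assumptions; the output mix borrows the SplitMix64 finalizer constants.

```python
MASK64 = (1 << 64) - 1


class ToyRng:
    """Toy mutable generator: LCG state plus a per-instance odd increment."""

    def __init__(self, state: int, increment: int) -> None:
        self.state = state & MASK64
        # Force the increment odd so the LCG keeps its full period.
        self.increment = (increment | 1) & MASK64

    def next_u64(self) -> int:
        # Advance the LCG (multiplier from Knuth's MMIX generator).
        self.state = (self.state * 6364136223846793005 + self.increment) & MASK64
        # Scramble the state with the SplitMix64 finalizer.
        z = self.state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        return z ^ (z >> 31)

    def split(self) -> "ToyRng":
        """Create a child seeded from the parent's next two outputs.

        The second draw picks the child's increment, so two children only
        follow the same sequence if both state and increment coincide.
        """
        child_state = self.next_u64()
        child_increment = self.next_u64()
        return ToyRng(child_state, child_increment)


parent = ToyRng(1, 1)
child = parent.split()
```

Because `split` consumes outputs from the parent, the parent moves on after each split and the next child is seeded differently. The extra per-child parameter means more random bits have to match before two streams overlap.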