But why would someone want to do this? It feels like mental overhead. So what advantages are there to a system like this?
You never have to reset state that you don't have in the first place. You still get other bugs, but you avoid one of the most common classes of them.
But you do need some state; you just want to discourage it. Some of it can come from laziness, though with difficulty. (Chris Okasaki’s Purely Functional Data Structures uses laziness to de-amortize the bound on a FIFO queue.) Other things, like ring buffers, are harder to make a case for without real mutable state.
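For the curious, here is roughly what that queue looks like. This is a sketch from memory of the real-time queue idea, not a quote from the book; the names Queue, emptyQ, snoc, pop and drain are my own. The trick is that the rear list is reversed onto the front lazily, and a "schedule" list forces one step of that work per operation, so no single operation pays for the whole reversal.

```haskell
-- Front list, rear list (newest first), and a schedule that is a suffix of
-- the front. Intended invariant: length schedule == length front - length rear.
data Queue a = Queue [a] [a] [a]

emptyQ :: Queue a
emptyQ = Queue [] [] []

-- Add an element at the back.
snoc :: Queue a -> a -> Queue a
snoc (Queue f r s) x = exec f (x : r) s

-- Remove the element at the front, if there is one.
pop :: Queue a -> Maybe (a, Queue a)
pop (Queue [] _ _)      = Nothing
pop (Queue (x : f) r s) = Just (x, exec f r s)

-- Matching on the schedule forces one more cell of the lazy front list.
-- When the schedule runs out, the rear is one longer than the front,
-- so start a new lazy rotation.
exec :: [a] -> [a] -> [a] -> Queue a
exec f r (_ : s) = Queue f r s
exec f r []      = let f' = rotate f r [] in Queue f' [] f'

-- Lazily computes f ++ reverse r, one cell at a time.
rotate :: [a] -> [a] -> [a] -> [a]
rotate []       (y : _)  a = y : a
rotate (x : xs) (y : ys) a = x : rotate xs ys (y : a)
rotate _        []       a = a  -- not reached if the invariant holds

-- Helper for trying it out: pop until empty.
drain :: Queue a -> [a]
drain q = case pop q of
  Nothing      -> []
  Just (x, q') -> x : drain q'
```

Note there is no IORef anywhere: the only "state" is which thunks have already been forced, and that is exactly the part that is hard to reason about.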
So you want to be able to express, say, an array of IORefs for your ring buffer. But the people who use it are made aware that it is a stateful construct and must use it as one.
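A minimal sketch of what I mean, assuming a fixed capacity of at least one slot. Ring, newRing, push, contents and demo are names I made up for illustration, not a library API. Every operation lives in IO, so the type itself tells the caller this thing is stateful.

```haskell
import Control.Monad (replicateM)
import Data.Array (Array, listArray, (!))
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical ring buffer: an immutable array of mutable cells.
data Ring a = Ring
  { cells :: Array Int (IORef (Maybe a)) -- one IORef per slot
  , next  :: IORef Int                   -- index of the slot to overwrite next
  , size  :: Int
  }

-- Allocate n empty slots (assumes n >= 1).
newRing :: Int -> IO (Ring a)
newRing n = do
  refs <- replicateM n (newIORef Nothing)
  pos  <- newIORef 0
  pure (Ring (listArray (0, n - 1) refs) pos n)

-- Overwrite the oldest slot and advance the write position.
push :: Ring a -> a -> IO ()
push r x = do
  i <- readIORef (next r)
  writeIORef (cells r ! i) (Just x)
  writeIORef (next r) ((i + 1) `mod` size r)

-- Read the filled slots, oldest first.
contents :: Ring a -> IO [a]
contents r = do
  i  <- readIORef (next r)
  xs <- mapM (\k -> readIORef (cells r ! ((i + k) `mod` size r)))
             [0 .. size r - 1]
  pure [x | Just x <- xs]

-- Push five values into three slots; only the last three survive.
demo :: IO [Int]
demo = do
  r <- newRing 3
  mapM_ (push r) [1 .. 5]
  contents r -- [3,4,5]
```

You cannot call push from a pure function without the types complaining, which is the whole point: the state is allowed, but it is visible.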
Read “What Color is Your Function?” for a counterpoint, of course.
a = f()
b = f()
c = g(a, b)
into
a = f()
c = g(a, a)
because f() is guaranteed not to have side effects. (If f() instead returns side effects through the IO monad, rather than actually performing them, g receives two data structures that encode which IO operations to perform.) In an imperative/procedural language, f() may have side effects, and if it does, the two programs are not equivalent. This may or may not make programs easier to reason about.
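To make that concrete, here is a small sketch in Haskell. All the names (f, g, tick, gIO, demoIO) are hypothetical stand-ins for the f and g above. In the pure half, the two forms are interchangeable. In the IO half, tick ref is only a description of an effect, and it is gIO that decides to run both descriptions it was handed.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Pure stand-ins for f and g.
f :: Int
f = sum [1 .. 100]

g :: Int -> Int -> Int
g x y = x + y

-- Same value either way, so a compiler may rewrite one into the other.
twoCalls, oneCall :: Int
twoCalls = let a = f; b = f in g a b
oneCall  = let a = f in g a a

-- An effectful stand-in: bump a counter and return its new value.
-- Building this value performs nothing; it only describes the effect.
tick :: IORef Int -> IO Int
tick ref = do
  modifyIORef' ref (+ 1)
  readIORef ref

-- Receives two descriptions and runs both, left to right.
gIO :: IO Int -> IO Int -> IO Int
gIO x y = (+) <$> x <*> y

demoIO :: IO (Int, Int)
demoIO = do
  ref <- newIORef 0
  let a = tick ref   -- nothing has happened yet
  r <- gIO a a       -- the same description is run twice: 1 + 2
  n <- readIORef ref -- so the counter ends at 2
  pure (r, n)        -- (3, 2)
```

So sharing a is still fine in the IO case: what gets shared is the description, and whether the effect happens once or twice is decided by whoever runs it.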