I think you should give the guy above the benefit of the doubt and restrict yourself to ST. IO and STM are different beasts, and still they're not imperative. The issue with IO and STM is global state. Mutation of global state can be done imperatively, as many imperative languages do. It can also be done functionally using monads to hide the affine types in a nice wrapper class in the absence of proper affine types or using proper affine type systems. Haskell choose the former. Other languages, like clean, choose the latter.
Same input -> same output is a nice property of referential integrity, which is a common thing we have in functional languages, but it is actually not due to the functional nature. Imperative languages can trivially be made to have this property by disallowing mutation of variables not in the immediate scope and restrict IO side effects (and maybe a few other straightforwards restrictions depending on the language in question).
With that out of the way. Same input -> same output holds for `orElse :: STM a -> STM a -> STM a`. However, you really have to expand out the types.
Here is the definition of STM (from https://hackage.haskell.org/package/base-4.15.0.0/docs/GHC-C...):
> newtype STM = STM (State# RealWorld -> (# State# RealWorld, a #))
So, `orElse` should behave the same as a function with this type:
> (State# RealWorld -> (# State# RealWorld, a #)) -> (State# RealWorld -> (# State# RealWorld, a #)) -> State# RealWorld -> (# State# RealWorld, a #)
Now, the definition of 'same input -> same output' is clear. Assuming the global state (State# RealWorld) is the same, then the output state will be the same. Actually, this is not a property of all languages, as most languages allow arbitrary side effects in any code. Haskell does not. This is the key property that makes STM easy in Haskell and difficult in other languages. Note that the 'State# RealWorld' is always handled linearly, so there is no possible way the same 'value' could be passed in twice. Again, hiding this behind a monad enforces that this value is handled linearly, since Haskell doesn't have linear types (or at least didn't when STM, ST, and IO were written).