The new value was available earlier if it was a cache miss?
I didn't remember that the SH2 didn't support virtual memory (perhaps because I've never used SuperH). That makes sense, then.
I think that, for the ways people most commonly use CPUs, it's acceptable if the value you read from a register in a load delay slot is nondeterministic, for example depending on whether you resumed from a page fault or not, or whether you had a cache miss or not. It could really impede debugging if it happened in practice, and it could impede reverse-engineering of malware, but I believe that such things are actually relatively common. (IIRC you could detect the difference between an 8086 and an 8088 by modifying the next instruction in the program, which would have been already loaded by the 8086 but not the 8088. But I'm guessing that under a single-stepping debugger the 8086 would act like an 8088 in this case.) The solution would probably be "Stop lifting your arm like that if it hurts;" it's easy enough to not emit the offending instruction sequences from your compiler in this case.
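To make the load-delay-slot case concrete, here is a minimal Python sketch of a toy machine. It is not a model of the SH2, MIPS, or any real CPU: the two-instruction ISA, the `run` function, and the `fault_after_load` flag are all hypothetical, and the assumption that resuming from a fault lets the pending load land before the next instruction issues is mine.

```python
def run(program, memory, fault_after_load=False):
    """Toy machine with a one-instruction load delay slot (hypothetical ISA)."""
    regs = {"r1": 7, "r2": 0}   # r1 starts out holding a stale value
    pending = None              # (register, value) of a load that has not landed yet
    for op, dst, src in program:
        landing, pending = pending, None
        if op == "load":
            if fault_after_load:
                # Assumption: after resuming from a fault, the load has
                # already completed by the time the next instruction issues.
                regs[dst] = memory[src]
            else:
                pending = (dst, memory[src])
        elif op == "move":
            regs[dst] = regs[src]   # reads the register file as it is right now
        if landing is not None:
            # The previous load lands only after this instruction has executed,
            # so an instruction in the delay slot sees the old register value.
            regs[landing[0]] = landing[1]
    return regs


MEMORY = {"x": 42}
PROGRAM = [
    ("load", "r1", "x"),    # r1 <- memory["x"], visible one instruction late
    ("move", "r2", "r1"),   # sits in the load delay slot
]

print(run(PROGRAM, MEMORY)["r2"])                         # stale r1
print(run(PROGRAM, MEMORY, fault_after_load=True)["r2"])  # freshly loaded r1
```

In the first run the `move` copies the stale `r1` (7); in the second it copies the loaded value (42). A compiler avoids the whole question by never reading the load's destination register in the delay slot.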
The case where people really worry about nondeterminism is where it exposes information in a security-violating way, as in Spectre, which isn't even nondeterminism at the register-contents level, just the timing level.
Myself, I have a strong preference for strongly deterministic CPU semantics, and I've been working on a portable strongly deterministic (but not for timing) virtual machine for archival purposes. But clearly strong determinism isn't required for a usable CPU.