
kragen (OP), 10mo ago

Thank you! I guess that, as long as the branch instruction itself can't modify any of the state that would cause it to branch or not, that's a perfectly valid solution. It seems like load delay slots would be more troublesome; I wonder how the MIPS R2000 and R3000 handled that? (I'm not sure the Tera supported virtual memory.)


garaetjjte, 10mo ago

Load delay slots don't seem to need special fault handling support: you're not supposed to depend on the old value being there in the delay slot.

One more thing about branch delay slots: it seems the original SuperH went for a very minimal solution. It prevents interrupts from being taken between a branch and its delay slot, and not much else. PC-relative accesses are relative to the branch target, and faults are also reported with the branch target address. As far as I can see, this makes faults in branch delay slots unrecoverable. In SH-3 they patched that by reporting faults in delay slots of taken branches with the address of the branch itself, so things can be fixed up in the fault handler.

kragen (OP), 10mo ago

Hmm, I guess that if the load instruction doesn't change anything except the destination register (unlike, for example, postincrement addressing modes) and the delay-slot instruction also can't do anything that would change the effective address being loaded from before it faulted (and can't depend on the old value), then you're right that it wouldn't need any special fault handling support. I'd never tried to think this through before, but it makes sense. I appreciate it.
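To make the restart argument concrete, here is a toy Python sketch. It is not a model of any real CPU: the register names, the page-granularity `mapped` set, and the "naive postincrement" ordering are all made up for illustration. It only shows why a load whose sole effect is writing its destination register can simply be re-executed after a fault, while one that has already bumped its base register cannot.

```python
PAGE = 4096  # hypothetical page size for the toy MMU


class PageFault(Exception):
    """Raised when an address falls in an unmapped page."""


def read(mem, mapped, addr):
    # Fault before returning anything if the page is not mapped.
    if addr // PAGE not in mapped:
        raise PageFault(addr)
    return mem[addr]


def load_plain(regs, mem, mapped, dst, base):
    # The only architectural change is the write to dst, and only on success.
    regs[dst] = read(mem, mapped, regs[base])


def load_postinc_naive(regs, mem, mapped, dst, base):
    # Hypothetical bad ordering: the base register is updated
    # before the access has had a chance to fault.
    addr = regs[base]
    regs[base] = addr + 4
    regs[dst] = read(mem, mapped, addr)


def run_with_restart(load, regs, mem, mapped, dst, base):
    """Run one load; on a fault, map the page and re-execute the same load."""
    try:
        load(regs, mem, mapped, dst, base)
    except PageFault as fault:
        mapped.add(fault.args[0] // PAGE)   # stand-in for the fault handler
        load(regs, mem, mapped, dst, base)  # restart the instruction
    return regs
```

With `mem = {100: 42, 104: 99}` and nothing mapped, restarting `load_plain` leaves the base at 100 and loads 42, whereas restarting `load_postinc_naive` loads from 104 and leaves the base at 108.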

As for SH2, ouch! So SH2 got pretty badly screwed by delay slots, eh?

garaetjjte, 10mo ago

Even without faults, some Stack Overflow answers indicate that on the R2000 the new value might be available in the delay slot if the load was a cache miss.
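A rough way to picture that, as a toy Python interpreter rather than a description of the actual R2000 pipeline: assume a load's result normally lands one instruction late, and assume a cache miss stalls the machine long enough that the load has finished before the delay-slot instruction runs. The instruction tuples and the `miss_addrs` parameter are invented for this sketch.

```python
def run(program, mem, regs, miss_addrs=frozenset()):
    """Run ('load', dst, addr) and ('move', dst, src) tuples.

    Assumption: a load that hits writes its register only after the
    following instruction has executed (one delay slot); a load whose
    address is in miss_addrs stalls and writes it immediately.
    """
    regs = dict(regs)
    pending = None  # (dst, value) of a load still in flight
    for op, dst, src in program:
        in_flight, pending = pending, None
        if op == "load":
            if src in miss_addrs:
                regs[dst] = mem[src]        # stall: result visible at once
            else:
                pending = (dst, mem[src])   # result arrives one slot later
        elif op == "move":
            regs[dst] = regs[src]
        if in_flight is not None:
            regs[in_flight[0]] = in_flight[1]  # delayed write-back
    return regs


program = [
    ("load", "r1", 0x10),  # load into r1
    ("move", "r2", "r1"),  # delay slot: reads r1
    ("move", "r3", "r1"),  # after the slot: always sees the loaded value
]
hit = run(program, {0x10: 42}, {"r1": 7})
miss = run(program, {0x10: 42}, {"r1": 7}, miss_addrs={0x10})
```

In this model `hit["r2"]` is the old value 7 and `miss["r2"]` is the new value 42, while `r3` is 42 either way, which is the kind of hit-or-miss dependence being described.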

As for SuperH, I don't think they cared too much. The primary use of fault handling is memory paging, and an MMU was added only in SH-3, so that's probably why they also fixed delay slot fault recovery then. Before that, faults were either illegal opcodes or alignment violations, and the answer to those was probably "don't do that".

kragen (OP), 10mo ago

The new value was available earlier if it was a cache miss?

I didn't remember that the SH2 didn't support virtual memory (perhaps because I've never used SuperH). That makes sense, then.

I think that, for the ways people most commonly use CPUs, it's acceptable if the value you read from a register in a load delay slot is nondeterministic, for example depending on whether you resumed from a page fault or not, or whether you had a cache miss or not. It could really impede debugging if it happened in practice, and it could impede reverse-engineering of malware, but I believe that such things are actually relatively common. (IIRC you could detect the difference between an 8086 and an 8088 by modifying the next instruction in the program, which would have been already loaded by the 8086 but not the 8088. But I'm guessing that under a single-stepping debugger the 8086 would act like an 8088 in this case.) The solution would probably be "Stop lifting your arm like that if it hurts;" it's easy enough to not emit the offending instruction sequences from your compiler in this case.
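The 8086/8088 trick can be sketched with a toy prefetch queue in Python. The real facts here are only the queue sizes (6 bytes on the 8086, 4 on the 8088); everything else is an assumption for illustration, in particular that the queue is completely full at the moment of the store and that a store updates memory but not bytes already sitting in the queue. The function name and byte values are made up.

```python
from collections import deque


def byte_executed_after_store(queue_len, distance, old=0x90, new=0x40):
    """Return the code byte the CPU ends up consuming at offset `distance`.

    Offsets count from the first byte after the storing instruction.
    Assumes the prefetch queue already holds the next queue_len bytes
    when the store happens, and that the store does not touch the queue.
    """
    code = bytearray([old] * 16)
    queue = deque(code[:queue_len])  # bytes fetched before the store
    code[distance] = new             # self-modifying store hits memory only
    fetch_ptr = queue_len
    for _ in range(distance):        # consume bytes up to the patched offset
        queue.popleft()
        queue.append(code[fetch_ptr])  # refill from (now modified) memory
        fetch_ptr += 1
    return queue[0]
```

Under those assumptions, patching the byte at offset 5 gives `byte_executed_after_store(6, 5) == 0x90` (stale, 8086-like) and `byte_executed_after_store(4, 5) == 0x40` (fresh, 8088-like).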

The case where people really worry about nondeterminism is where it exposes information in a security-violating way, as in Spectre, which isn't even nondeterminism at the register-contents level, just the timing level.

Myself, I have a strong preference for strongly deterministic CPU semantics, and I've been working on a portable strongly deterministic (but not for timing) virtual machine for archival purposes. But clearly strong determinism isn't required for a usable CPU.
