beagle3 · 4y ago

Don’t you have to add an INTO (interrupt on overflow) to trigger that? I don’t remember there being a mode where arithmetic overflows trap automatically.

(And if that’s the case, by now INTO is probably a slow microcoded instruction, and a JO _trigger_into is likely much faster.)

adrian_b · 4y ago

Unfortunately, INTO was deprecated by AMD 20 years ago, with the transition to 64-bit.

No Intel/AMD CPU supports INTO in 64-bit mode.

It is wise to compile any C/C++ program with options like:

"-fsanitize=undefined -fsanitize-undefined-trap-on-error"

(for gcc), or the equivalent options of whichever compiler you use.

In that case, the compiler will add instructions with the same effect as INTO, but with more overhead than was needed on 32-bit Intel/AMD CPUs.
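To make concrete what "instructions with the same effect as INTO" could look like at the source level, here is a minimal sketch in C. It assumes GCC or Clang, whose `__builtin_add_overflow` and `__builtin_trap` builtins are used below; the helper names `checked_add` and `add_or_trap` are hypothetical, and the code the sanitizer actually emits may differ.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: returns true and stores a + b in *out when the
 * sum fits in an int; returns false when the addition overflows. */
static bool checked_add(int a, int b, int *out)
{
    /* GCC/Clang builtin: returns true if the addition overflowed. */
    return !__builtin_add_overflow(a, b, out);
}

/* Hypothetical helper: add, then branch to a trap on overflow. This is
 * roughly the "ADD; JO trap" pattern discussed in this thread, written
 * by hand rather than inserted by the compiler. */
static int add_or_trap(int a, int b)
{
    int sum;
    if (__builtin_add_overflow(a, b, &sum))
        __builtin_trap(); /* abnormal termination, like a trapping check */
    return sum;
}
```

On x86-64, compilers typically lower the overflow test to a conditional jump on the overflow flag right after the ADD, so `add_or_trap(INT_MAX, 1)` would trap while `add_or_trap(40, 2)` simply returns the sum.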

gpderetta · 4y ago

Are you sure it is more overhead? The overflow sanitizer adds a single JO in the critical path. INTO has been microcoded for decades in 32-bit mode.

adrian_b · 4y ago

The guaranteed overhead is in size, because JO takes either 1 or 5 extra bytes per overflow check compared to INTO (2 bytes for the short form or 6 for the near form, versus 1 byte for INTO). If you want the location of the exception to be known precisely, the program counter must be saved separately for each check, which adds a few extra bytes per check (e.g. 5 extra bytes for a CALL instruction, which allows the use of a short JO, so there are 6 extra bytes per check in total). Because there are many checks in a program, the extra program size is non-negligible.

The execution time should be about the same for INTO and JO. In the normal case, both are effectively ignored. In the error case, JO is guaranteed to be mispredicted, so its long execution time, plus the time needed to fetch and execute the following instructions required to match the effect of an INTO (e.g. saving state and a second, unpredicted jump or call that might be needed to reach the actual exception handler), will be about the same as what an INTO exception would have needed, if not longer.

beagle3 (OP) · 4y ago

If this were really a concern, INTO would likely not have been deprecated in the move to 64-bit, and other architectures would have supported something similar as well.

And if you're really worried about size, there is "JNO 1f; INT xy; 1:", with xy having to be system-wide and well known, but it would only take 3 extra bytes (or just 2, if you use HLT instead and the OS knows what the JNO/HLT combo means). The branch predictor will likely predict the overflow branch as not taken in any tight loop.
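The byte counts quoted in this thread can be checked with a little arithmetic. The sketch below is only an illustration: the instruction lengths are the standard x86 encodings (INTO is `CE`, short Jcc is `70+cc rel8`, near JO is `0F 80 rel32`, CALL is `E8 rel32`, INT is `CD imm8`, HLT is `F4`), while the enum names and the `extra_bytes` helper are hypothetical.

```c
/* Encoded lengths in bytes of the instructions discussed in the thread. */
enum {
    SIZE_INTO       = 1, /* CE */
    SIZE_JCC_SHORT  = 2, /* 70 (JO) or 71 (JNO) + rel8 */
    SIZE_JO_NEAR    = 6, /* 0F 80 + rel32 */
    SIZE_CALL_REL32 = 5, /* E8 + rel32 */
    SIZE_INT_IMM8   = 2, /* CD + imm8 */
    SIZE_HLT        = 1  /* F4 */
};

/* Hypothetical helper: bytes a check sequence costs beyond a lone INTO. */
static int extra_bytes(int sequence_size)
{
    return sequence_size - SIZE_INTO;
}
```

With these lengths, a short JO costs 1 extra byte and a near JO 5; a short JO plus CALL costs 6; "JNO; INT xy" costs 3; and "JNO; HLT" costs 2, matching the figures given by both commenters.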
