> This hasn't been dependably true on x86 for almost two decades. SSE2 does double-precision computation at native width, not in the extended 80-bit format. Some 32b compilers still use 80-bit x87, but almost no 64b compilers do so.
My experience has been different: forcing SSE instructions gives me a different result on some math calculations. This is on a Core 2 CPU, running Boost odeint calculations, built with either Clang or GCC.
Do you have a reference for why it's rare?
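
For anyone who wants to see the kind of discrepancy in question, here is a minimal sketch (not the odeint code, just an illustration). The helper names `scale_round_trip` and `scale_round_trip_extended` are made up for this example. The first does the arithmetic in `double` and lets the compiler pick the intermediate precision; the second forces `long double` intermediates, which is 80-bit extended on typical x86 GCC/Clang targets (an assumption, other platforms differ).

```cpp
#include <cfloat>   // FLT_EVAL_METHOD, LDBL_MAX_EXP, DBL_MAX_EXP
#include <cmath>    // std::isinf

// Hypothetical helper: multiply by a factor, then divide by it again.
// volatile stops the compiler from folding the arithmetic at compile time.
// With SSE2 math (FLT_EVAL_METHOD == 0) the product is rounded to double,
// so a large enough x overflows to infinity before the division.
// With x87 math (FLT_EVAL_METHOD == 2) the product can stay in an 80-bit
// register, survive the overflow, and come back finite.
double scale_round_trip(double x, double factor) {
    volatile double vx = x;
    volatile double vf = factor;
    double r = (vx * vf) / vf;
    return r;
}

// Hypothetical helper: same computation with explicit long double
// intermediates, to mimic extended-precision evaluation regardless of
// which instruction set the compiler chose for double arithmetic.
double scale_round_trip_extended(double x, double factor) {
    volatile long double vx = x;
    volatile long double vf = factor;
    long double r = (vx * vf) / vf;
    return static_cast<double>(r);
}
```

Calling `scale_round_trip(1e308, 10.0)` should return infinity when `FLT_EVAL_METHOD` is 0, and will typically return a finite value in an x87 build (for example GCC with `-m32 -mfpmath=387`, versus `-mfpmath=sse -msse2`), though the exact behaviour there depends on whether the compiler spills the intermediate to memory.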