
raphlinus 3y ago

The actual rules are very complicated. C allows greater precision for intermediate results, but compilers are sometimes careful to stick to IEEE rounding. [1] is a good general overview, and [2] covers FMA in particular. In [3] I've set up a Godbolt example to play with. By default -O3 gives you FMA, but -O, or -O3 with -ffp-contract=off, doesn't. So you absolutely can get different results depending on the optimization level.

[1]: https://randomascii.wordpress.com/2012/03/21/intermediate-fl...

[2]: https://kristerw.github.io/2021/11/09/fp-contract/

[3]: https://godbolt.org/z/eTz8o6b3P


mgaunard 3y ago

The rule is very simple; I'm not seeing anything in what you say suggesting that it isn't.

raphlinus (OP) 3y ago

Perhaps the rule in the standard is simple: the compiler can arbitrarily round to finer precision than IEEE. In practice, though, it's complicated, because the same code can behave quite differently depending on what chip it's compiled for, the optimization level, and other factors. If you want to control it, i.e. model it as something other than nondeterminism, figuring out the right combination of compiler flags and so on is tricky.

I'll also point out that FMA is relatively new, so it's pretty easy to write code that works fine when compiled for default x86_64/SSE2 but breaks when compiled for a more recent target CPU.

mgaunard 3y ago

All you're doing is listing simple consequences of the rule.

The compiler may use increased precision for intermediate computations. That means sometimes it will and sometimes it won't. If you understand the basic situations where it does so, you can see that it depends on register allocation, which of course depends not only on the optimization level but can also change whenever you change anything at all in the source code.

