
0 points · KMag · 5y ago · 0 comments

Very nice. I hope that in the future, native code will be the distribution format about as often as we write inline assembly. Sometimes you'll really need precise control over instructions, but for 99% of code, it's not worth the cost of giving up long-term technology improvements.

In particular, x86's total store ordering (TSO) memory model causes some memory fences to disappear at the machine code level. The AArch64 relaxed memory model allows for lower cache synchronization overhead, but code with correct memory fences loses this information once compiled to x86, so AArch64 binary translators need either overly conservative translation or a higher-overhead TSO mode. These days, hardware acquire/release/full flavors of memory fences better match the C++ and Java memory models, but some hardware has load/store/full flavors of memory fences. Binary translation across these flavors means changing all fences to full fences, or else some static analysis that's far beyond anything I'm aware of existing at this time.
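To make the "fences disappear" point concrete, here is a minimal C++ sketch of release/acquire message passing. The names (`payload`, `ready`, `producer`, `consumer`, `run_message_passing`) are made up for illustration, and the instruction choices mentioned in the comments are what mainstream compilers typically emit, not something the language guarantees.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, chosen only for this illustration.
int payload = 0;                  // plain, non-atomic data
std::atomic<bool> ready{false};   // flag that publishes the payload

void producer() {
    payload = 42;
    // Release store: earlier writes may not be reordered after it.
    // On x86 this is typically an ordinary `mov`, because TSO already
    // provides the ordering; on AArch64 it is typically `stlr`.
    ready.store(true, std::memory_order_release);
}

int consumer() {
    // Acquire load: later reads may not be reordered before it.
    // Typically an ordinary `mov` on x86 and `ldar` on AArch64.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;  // the release/acquire pair makes the write visible
}

// Runs one producer and one consumer and returns what the consumer saw.
int run_message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread t1(producer);
    std::thread t2([&seen] { seen = consumer(); });
    t1.join();
    t2.join();
    assert(seen != -1);  // the consumer thread ran and stored a result
    return seen;
}
```

In the x86 binary, the release store and the acquire load look just like any other store and load, so a translator cannot tell which accesses needed ordering and has to treat all of them as ordered (or run the core in a TSO mode).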


2 comments · 1 top-level

saagarjha · 5y ago · 1 in thread

Or just specialized hardware to do the TSO…

KMag (OP) · 5y ago

One could, but then one would give up the advantages of a relaxed memory model.
