> I understand a math library person caring, I don't think I care.
Not losing much sleep over this one. I'm not sure there's anything in the spec that stops implementations from recognizing the two instructions and fusing them into a single atomic operation for the backends to deal with. It'll occupy more space in the L1 cache, but that's it.