Ok, I'll try to explain it since you replied to me, even though I didn't downvote you.
The XMM registers exist for SIMD instructions: each register holds several values at once, and a single instruction performs the same operation on every value in the register.
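To make the intended usage concrete, here's a rough sketch in C using the SSE intrinsics from `<xmmintrin.h>` (x86 only). The `add4` helper is a name I made up for illustration. It only shows the ordinary SIMD pattern, not the hack being discussed.

```c
#include <xmmintrin.h>  /* SSE intrinsics; x86/x86-64 only */

/* Hypothetical helper: adds four pairs of floats element-wise. */
void add4(const float *a, const float *b, float *out)
{
    __m128 va = _mm_loadu_ps(a);       /* load 4 floats into an XMM register */
    __m128 vb = _mm_loadu_ps(b);       /* load the other 4 floats */
    __m128 sum = _mm_add_ps(va, vb);   /* one packed add: four additions at once */
    _mm_storeu_ps(out, sum);           /* write the 4 results back to memory */
}
```

Calling `add4` on `{1, 2, 3, 4}` and `{10, 20, 30, 40}` fills `out` with the four element-wise sums. That "same operation on every lane" behaviour is the expected use.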
The interesting bit of this little hack is how it breaks that expected data-level parallelism and instead provides a (relatively) high-level interface for the new usage. It has nothing to do with shuffling data to and from those registers.
You're getting downvoted because your initial comment ignored the part that used the registers in an unintended way and focused on the parts that match the intended usage.
On top of that, you're being condescending about the points you're trying to make. You might not have intended it, but that's how it comes across.