> but an X86 instruction is not just "an instruction" anymore.
This is technically true, but it overstates things in practice. Decoding one instruction into many uops (micro-ops) is mainly used for compatibility with the crufty parts of the x86 spec. In general, for anything other than rmw (read-modify-write) or locked operations, a competent compiler or assembly writer will only very rarely emit instructions that decode to more than one uop. Because of the way the frontend works, microcoded instructions are extraordinarily slow on real CPUs.
Modern x86 is basically a RISC with a very complex decode, a few extra useful complex operations tacked on, and piles and piles of moldy old cruft that no one should ever touch.
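
To make the rmw/locking exception concrete, here is a rough sketch in C. The function names are made up for illustration, and the comments describe what an optimizing x86-64 compiler typically emits, not a guarantee; the exact instructions and uop counts depend on the compiler, flags, and microarchitecture.

```c
#include <stdatomic.h>

/* Register-only arithmetic: typically a single simple instruction
   (an add or lea), which is the common single-uop case. */
int add_regs(int a, int b) {
    return a + b;
}

/* Read-modify-write on memory: a compiler may emit one instruction with a
   memory destination here, which the core has to split into load, add,
   and store work internally. */
void add_mem(int *p, int v) {
    *p += v;
}

/* Atomic read-modify-write: on x86 this usually becomes a lock-prefixed
   instruction, the other case where multiple uops are expected. */
void add_locked(atomic_int *p, int v) {
    atomic_fetch_add(p, v);
}
```

Compiling this with optimizations on and reading the generated assembly is an easy way to check which form your compiler actually picks.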