Reduced I-Cache, uop cache, and decoder pressure would also have a beneficial impact. On the flip side, APX instructions would all be an entire byte longer than their AMD64 counterparts, so some of the benefits would be more muted than they might first appear and optimizing between 16 registers and shorter instructions vs 32 registers with longer instructions is yet another tradeoff for compilers to make (and takes another step down the path of being completely unoptimizable by humans).