Right, but all architectures can execute many combinations of instructions in a single cycle, so instruction count is not really a great proxy for performance.
Same for code size. If each instruction is half the size, then needing 1.5x as many instructions still means smaller binaries: 0.5 × 1.5 = 0.75 of the original code size.
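To make the code-size arithmetic concrete, here is a rough sketch. The function names and the "A"/"B" labels are made up for illustration, and it assumes a single average instruction size per ISA, which real binaries (mixed instruction widths, data sections, alignment padding) won't match exactly.

```python
def relative_code_size(size_ratio: float, count_ratio: float) -> float:
    """Code size of hypothetical ISA B relative to ISA A.

    size_ratio:  average instruction size of B divided by that of A
    count_ratio: number of instructions B needs divided by what A needs
    """
    return size_ratio * count_ratio


def break_even_count_ratio(size_ratio: float) -> float:
    """How many times more instructions B can use before its code gets bigger than A's."""
    return 1.0 / size_ratio


# Numbers from the comment: half-size instructions, 1.5x as many of them.
ratio = relative_code_size(0.5, 1.5)
limit = break_even_count_ratio(0.5)
print(f"B's code is {ratio:.2f}x the size of A's")
print(f"B only loses on size once it needs more than {limit:.1f}x the instructions")
```

So with half-size instructions, the denser encoding wins on size until it needs more than twice as many instructions.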