It's more than that. In RISC-V, you only need the two lowest bits of each instruction to determine whether it's a 16-bit or 32-bit instruction; you don't need to decode an instruction to know its length.
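To make the length check concrete, here is a minimal sketch. The function names are made up for illustration, and it assumes only the standard 16-bit and 32-bit encodings are in play (the encodings reserved for longer instructions are ignored) and that the code bytes are little-endian, so the two lowest bits sit in the first byte of each instruction.

```python
def rv_instruction_length(first_byte: int) -> int:
    """Length in bytes of a RISC-V instruction, given its first (lowest) byte.

    Sketch only: treats every instruction whose two lowest bits are 0b11
    as 32-bit and ignores the encodings reserved for longer instructions.
    """
    return 4 if (first_byte & 0b11) == 0b11 else 2


def instruction_lengths(code: bytes) -> list[int]:
    """Walk a little-endian code stream and collect each instruction's length."""
    lengths = []
    offset = 0
    while offset < len(code):
        length = rv_instruction_length(code[offset])
        lengths.append(length)
        offset += length  # next instruction starts right after this one
    return lengths


# c.li a0, 0 (0x4501, compressed) followed by addi a0, x0, 0 (0x00000513)
stream = bytes.fromhex("0145" "13050000")
print(instruction_lengths(stream))
```

Finding each boundary only takes a two-bit test on one byte, with no need to look at opcodes or operands.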
> [...] we see the cache sizes M1/M2 need just to deal with this, [...]
Do the M1/M2 need these cache sizes, or do they have them simply because they can, thanks to a default page size that is 4x larger (16 KiB versus 4 KiB on x86)? (Normally, page size wouldn't be much of a constraint on instruction caches, but on x86 it is, because the x86 ISAs don't require explicit instruction cache invalidation after self-modifying code; x86 processors would likely have larger L1 instruction caches if they could get away with it.)