Ignoring RISC-V’s compressed encoding seems a rather artificial restriction.
The "C" extension is technically optional, but I'm not aware of anyone who has made or sold a production chip without it -- generally only student projects or tiny cores for FPGAs running very simple programs don't have it.
My estimate is that if you have even 200 to 300 instructions in your code, it's cheaper to implement "C" than to build the extra SRAM/cache to hold the bigger code without it.
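To put rough numbers on that estimate, here is a minimal sketch of the code-size side of the trade-off. The 50% compressible fraction is an assumption picked for illustration, not a measurement, and `code_bytes` is a hypothetical helper. The area of the "C" decoder itself is not modelled, since no figure for it is given here.

```python
def code_bytes(n_instructions, compressible_fraction):
    """Bytes of code when a fraction of instructions uses 16-bit encodings."""
    n_short = round(n_instructions * compressible_fraction)
    n_long = n_instructions - n_short
    return 2 * n_short + 4 * n_long


for n in (200, 300):
    full = code_bytes(n, 0.0)    # every instruction is 32 bits
    packed = code_bytes(n, 0.5)  # assumed: half the instructions compress
    print(n, full, packed, full - packed)
```

Under that assumption the saving is one byte per instruction on average, so a few hundred instructions save a few hundred bytes of SRAM.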
The compressed RISC-V encoding must be compared with the ARMv8-M encoding, not with ARMv8-A.
The base 32-bit RISC-V encoding may be compared with ARMv8-A, because only the base encoding can reach comparable performance.
All the comparisons where RISC-V has better code density compare the compressed encoding with the 32-bit ARMv8-A. This is a classic apples-to-oranges comparison, because the compressed encoding will never be in the same performance league as ARMv8-A.
When the comparisons are matched (16-bit RISC-V encoding with 16-bit ARMv8-M, and 32-bit RISC-V with 32-bit ARMv8-A), RISC-V always loses in code density in both comparisons, because only the RISC-V branch instructions are frequently shorter than those of ARM, while all the other instructions are frequently longer.
There are good reasons to use RISC-V for various purposes, where either the lack of royalties or the easy customization of the instruction set is important, but claiming that it should be chosen because it is better, rather than because it is cheaper, looks like the story of the sour grapes.
The value of RISC-V is not in its instruction set, because there are thousands of people who could design better ISAs in a week of work.
What is valuable about RISC-V is the set of software tools: compilers, binutils, debuggers etc. While a better ISA can be designed in a week, recreating the complete software environment would take years of work.
That's 100% nonsense. They have the same performance, and in fact some pipelines can get better performance, because they fetch a fixed number of bytes, and with compressed instructions that means more instructions per fetch.
The rest of the argument rests on this fallacy and falls apart with it.
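The fetch point above can be made concrete with a small sketch. The 16-byte fetch width and the instruction mix are assumptions chosen for illustration, and `instructions_in_fetch` is a hypothetical helper, not a model of any real pipeline.

```python
def instructions_in_fetch(fetch_bytes, lengths):
    """Count whole instructions that fit in one fetch window.

    lengths: instruction sizes in bytes (2 or 4), in program order.
    """
    used = 0
    count = 0
    for size in lengths:
        if used + size > fetch_bytes:
            break  # next instruction would straddle the window; stop counting
        used += size
        count += 1
    return count


FETCH_BYTES = 16  # assumed fetch width, for illustration only

# All 32-bit instructions: 4 fit in the window.
print(instructions_in_fetch(FETCH_BYTES, [4] * 8))
# Mixed 16-bit and 32-bit instructions: 6 fit in the same window.
print(instructions_in_fetch(FETCH_BYTES, [2, 4, 2, 2, 4, 2, 4, 4]))
```

This only counts what arrives per fetch; whether the decoder can keep up with the extra instructions is the separate question the reply below raises.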
If you want to use RISC-V at a performance level good enough for something like a mobile phone or a personal computer, you need to decode at least 8 instructions per clock cycle simultaneously, and preferably many more, because to match 8 instructions of other CPUs you need at least 10 to 12 RISC-V instructions and sometimes many more.
Nobody has succeeded in simultaneously decoding a significant number of compressed RISC-V instructions, and it is unlikely that anyone will attempt it, because the cost in area and power of a decoder able to do this is much larger than the cost of a decoder for simultaneous decoding of fixed-length instructions.
This is also why ARM uses a compressed encoding in its -M CPUs for embedded applications but a 32-bit fixed-length encoding in its -A CPUs, for applications where more than 1 watt per core is available and high performance is needed.
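A minimal sketch of the dependency behind the decode argument above: in RISC-V, the two low bits of an instruction's first 16-bit parcel are `11` for a 32-bit encoding and anything else for a 16-bit one, so where instruction k+1 starts depends on the length of instruction k. The sketch assumes only 16-bit and 32-bit encodings are present, and `instruction_starts` is a hypothetical helper. It is a serial software scan that illustrates the dependency; it makes no claim about how hardware decoders actually resolve it or what that costs.

```python
def instruction_starts(parcels):
    """Return the indices of 16-bit parcels that begin an instruction.

    parcels: list of 16-bit integers in program order.
    Assumes only 16-bit and 32-bit encodings (no longer formats).
    """
    starts = []
    i = 0
    while i < len(parcels):
        starts.append(i)
        # Low two bits other than 0b11 mark a 16-bit (compressed) encoding.
        is_compressed = (parcels[i] & 0b11) != 0b11
        i += 1 if is_compressed else 2
    return starts


# c.nop (0x0001), then addi x0, x0, 0 (0x00000013, stored as two parcels,
# low half first), then c.nop again.
stream = [0x0001, 0x0013, 0x0000, 0x0001]
print(instruction_starts(stream))
```

Note that the parcel `0x0000` at index 2 would look like the start of a 16-bit instruction if examined in isolation; only the preceding parcel reveals that it is the upper half of a 32-bit one. With a fixed 32-bit encoding, every start is known in advance.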