Having a decent devkit may help bring more native ARM software, which may make these underpowered machines struggle less with speed and compatibility.
2 - Arm also implements weak ordering, which is a type of memory model. It lets the CPU reorder memory accesses that don't depend on each other, so they don't have to wait in line behind earlier ones unless the code asks for it with a barrier. (Magic). x86 has strong ordering, which can make it slower in specific scenarios.
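To make the ordering point less magic, here is a toy sketch in Python of the classic "message passing" litmus test: one thread writes `data` and then `flag`, the other reads `flag` and then `data`. It is not how any real CPU works. The names (`WRITER`, `READER`, `outcomes`) are made up for illustration, the "strong" case is modeled as plain program order (stricter than what x86 actually guarantees, though it gives the same answer for this particular test), and the "weak" case assumes any two independent accesses in a thread may swap.

```python
from itertools import permutations

# Message-passing litmus test. Each op is (kind, variable).
WRITER = [("store", "data"), ("store", "flag")]   # publish data, then raise flag
READER = [("load", "flag"), ("load", "data")]     # check flag, then read data


def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def outcomes(allow_reordering):
    """Return the set of (flag_seen, data_seen) pairs the reader can observe."""
    def orders(thread):
        # Toy rule (an assumption, not a real hardware spec): under the weak
        # model, independent accesses to different addresses may be reordered;
        # under the strong model, program order is kept.
        return list(permutations(thread)) if allow_reordering else [tuple(thread)]

    seen = set()
    for w in orders(WRITER):
        for r in orders(READER):
            for schedule in interleavings(list(w), list(r)):
                memory = {"data": 0, "flag": 0}
                regs = {}
                for kind, var in schedule:
                    if kind == "store":
                        memory[var] = 1
                    else:
                        regs[var] = memory[var]
                seen.add((regs["flag"], regs["data"]))
    return seen


print("strong:", sorted(outcomes(False)))
print("weak:  ", sorted(outcomes(True)))
```

In this toy model the strong case never produces `(1, 0)` (flag seen as set, data still stale), while the weak case does. That extra outcome is the price of the freedom to reorder, and it is why code ported to Arm sometimes needs explicit barriers or acquire/release atomics that it got away without on x86.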
TBH there are also various mostly Apple/TSMC-"exclusive" reasons why their chips are better than the others:
A - On Apple Silicon, pages are larger (16 KiB vs. the usual 4 KiB on x86), but not too large
B - There are various accelerators leveraged by libraries provided in the OS (or the provided toolchains, etc.)
C - Apple got to use the best TSMC process years before the competition.
D - TSMC is ahead (I'm curious what Zen 4 will deliver on laptops, btw)
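On point A, a rough back-of-the-envelope sketch of why page size matters: the amount of memory a TLB can cover without a miss is roughly entries times page size. The entry count below is a made-up number, used only to keep the two cases comparable; 16 KiB is the page size macOS uses on Apple Silicon, and 4 KiB is the usual x86 default.

```python
import mmap

KIB = 1024


def tlb_reach_bytes(entries, page_size):
    """Memory covered by a TLB holding `entries` translations of `page_size` bytes."""
    return entries * page_size


ENTRIES = 1024  # hypothetical TLB size, identical for both cases

print("page size on this machine:", mmap.PAGESIZE, "bytes")
print("reach with  4 KiB pages:", tlb_reach_bytes(ENTRIES, 4 * KIB) // KIB, "KiB")
print("reach with 16 KiB pages:", tlb_reach_bytes(ENTRIES, 16 * KIB) // KIB, "KiB")
```

Same number of entries, four times the reach. Going much larger than that starts wasting memory on partially used pages, which is presumably the "not too large" part.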
So it's mixed. The ARM ISA probably plays a small role in the perf of Apple Silicon vs. x86 chips, but is probably not the main cause of the perf gap.
If the company is selling the whole device, including the chip, they can afford to throw more transistors and die area at performance with much less impact on their bottom line.
OTOH Qualcomm is constantly lagging behind Apple, promising that next year they will be almost as fast as a previous-gen Apple chip. The most generous explanation I've heard is that Qualcomm is specializing in quantity over quality, and focuses on making CPUs cheap, rather than fast.