Then imagine how much performance you could get out of OS-level VMs that understand the processes at the VM level (i.e. can access code in some IR that they can analyze easily, recompile on the fly, etc.). There is already stuff like this in specialized markets (e.g. kernel-level GC for the JVM), but it's still fairly specific.
Then there are all the shitty legacy abstraction layers in things like filesystems - ZFS is a perfect example of the kind of gains you can get for free if you just rethink the design decisions behind the current stack and see what applies and what doesn't.
If the benefit of rewriting these systems ever outweighs the cost, we have huge potential areas for performance gains. Modern systems are very far from being performance efficient; they are efficient in terms of various other factors (development cost, compatibility, etc.).
I also wish ZFS would grow an encryption layer (one that isn't based on Sunacle's implementation, since Sunacle doesn't want to share that one, so no one can use it).
Compare that to some of the code people ran on 6502 derivatives.
Abstraction may be more efficient in terms of programmer time, and performance efficiency may be high enough so as to be immaterial, but the two shouldn't be conflated.
Reminds me of a version of this image[1] with a discussion superimposed over it: someone says, "but if he had a big enough pile of ladders he could get over the wall!" and someone else responds, "welcome to Android optimization." I think we see something similar with JavaScript performance.
There are other materials to make chips out of besides silicon (gallium arsenide and carbon, for instance), each of which has different scaling properties.
There are also ways to make chips denser by stacking wafers instead of trying to shrink features.
Stacking is certainly a thing, and it's good for memory (see AMD's newest graphics card), but power dissipation limits how much high-speed logic you can put under a given area.
It has a great graph of engineering effort vs. Moore's law, which made it cheaper to just wait for a faster chip than to put in the effort.