I think most HPC people would disagree with this statement. State-of-the-art HPC code is still written in ASM (see, e.g., https://github.com/xianyi/OpenBLAS) [that's what Intel is doing too].
ASM makes sense when the total run time saved across all uses of a routine outweighs the time it takes to write the ASM. That makes a lot of sense for BLAS, less so for HPC projects that are more speculative or less fundamental. CVODES, for instance, doesn’t need to be written in ASM, and I think Julia makes a strong case that it could have been written in Julia.
I don't think they would. I think they realize that state-of-the-art HPC code is a small fraction of all the code written. I doubt that these people write ASM instead of Python or JS or C or whatever when writing simple scripts.
That ASM code is not necessarily written by hand, however. You'd think that for high-performance code with limited scope, a superoptimizer would be used.
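For anyone unfamiliar with the term, here is a toy sketch of the basic idea: enumerate instruction sequences from shortest to longest and return the first one that matches a reference function on every input. The five-op, single-register, 8-bit instruction set and the names `OPS`, `run`, and `superoptimize` are all made up for illustration. A real superoptimizer works on actual machine instructions and needs a much smarter search and equivalence check than brute force.

```python
from itertools import product

# Hypothetical single-register instruction set, invented for this sketch.
OPS = {
    "neg": lambda x: -x,
    "not": lambda x: ~x,
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "shl": lambda x: x << 1,
}
MASK = 0xFF  # pretend the register is 8 bits wide


def run(program, x):
    """Execute a sequence of op names on x, wrapping to 8 bits after each op."""
    for op in program:
        x = OPS[op](x) & MASK
    return x


def superoptimize(target, max_len=3):
    """Return the shortest op sequence equal to target on all 256 inputs, or None."""
    for length in range(max_len + 1):          # shortest programs first
        for program in product(OPS, repeat=length):
            if all(run(program, x) == (target(x) & MASK) for x in range(256)):
                return program
    return None


if __name__ == "__main__":
    print(superoptimize(lambda x: -x - 1))     # bitwise NOT in disguise
    print(superoptimize(lambda x: 2 * x + 2))
```

The search space grows exponentially with program length, which is why this only works for very short sequences over a tiny instruction set.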
Not sure what a "superoptimizer" would look like in this context. For reference, I know for sure that this https://github.com/giaf/blasfeo (which beats Intel MKL) was coded entirely by hand.