> most of the cost is actually to do with loading and parsing classes
It boggles my mind that loading and parsing even a few thousand classes is something that a human can perceive as "dog slow" when carried out by a quad-core 1.2 GHz CPU.
My understanding is that it's I/O bound: the JVM more or less fetches each class from the JAR file or disk individually.
I am not sure whether that's an implementation decision or something imposed by the JVM specification. At some point I want to tinker with OpenJ9 to find out.
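For anyone who wants to poke at this before digging into a JVM's source, here is a rough sketch of timing class loading from inside a program. The class name `ClassLoadTiming`, the helper `timeLoads`, and the three JDK classes it loads are just illustrative picks. It only measures wall-clock time per pass, so it cannot tell I/O apart from parsing and verification, and JDK classes may be served from the runtime image or a shared archive rather than a JAR, so treat the numbers as a starting point, not evidence either way.

```java
import java.util.List;

public class ClassLoadTiming {
    // Loads each named class without running its static initializers
    // and returns the elapsed wall-clock time in nanoseconds.
    static long timeLoads(List<String> names) throws ClassNotFoundException {
        ClassLoader loader = ClassLoadTiming.class.getClassLoader();
        long start = System.nanoTime();
        for (String name : names) {
            Class.forName(name, false, loader);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        // Arbitrary JDK classes; some may already be loaded by JVM startup.
        List<String> names = List.of(
            "java.util.concurrent.ConcurrentSkipListMap",
            "java.text.SimpleDateFormat",
            "java.util.regex.Pattern");

        long first = timeLoads(names);   // may include real loading work
        long second = timeLoads(names);  // classes are already loaded by now

        System.out.printf("first pass: %d us, second pass: %d us%n",
            first / 1000, second / 1000);
    }
}
```

Running the JVM with `-verbose:class` alongside this prints each class as it is loaded, which helps show how many classes a single lookup drags in.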