> Context switching cost due to CPU cache thrashing doesn't go away regardless of which type of thread you're using.
Except it's not a context switch? You're jumping to another instruction in the program, one that should be very predictable. You might lose your cache, but it will depend on a ton of factors.
> there's a new cost caused by allocating the internal continuation object, and copying state into it.
This is more of a problem with the implementation (not every virtual thread language does it this way), but yeah this is more overhead on the application. I assume there's improvements that can be made to ease GC pressure, like using object pools.
Usually virtual threads are a memory vs CPU tradeoff that you typically use in massively concurrent IO-bound applications. Total throughput should take over platform threads with hundreds of thousands of connections, but below that probably perform worse, I'm not that surprised by that.