Thanks for sharing! So that's roughly 20% slower for computationally intensive workloads, likely due to nested paging putting increased pressure on the TLB. For applications using huge pages, the slowdown would likely be much smaller. Both Docker and KVM introduce a lot of overhead with frequent, small I/Os. That's likely down to the chattiness of the syscalls with the kernel, which is a problem even without virtualization. Doing more work per syscall reduces that overhead, e.g. with writev, readv, sendmmsg, recvmmsg, etc. The switch into the kernel on every syscall (and especially the cache and TLB pollution it causes) is very expensive.
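
To make the "more work per syscall" point concrete, here is a minimal sketch in Python of batching many small buffers into as few writev calls as possible, instead of issuing one write per buffer. The helper name `write_batched` and the batch limit `IOV_BATCH` are made up for illustration (1024 is the usual IOV_MAX on Linux, but that is an assumption here), and `os.writev` is only available on Unix-like systems.

```python
import os

# Assumed upper bound on buffers per writev call; 1024 is the usual
# IOV_MAX on Linux, but this constant is illustrative, not authoritative.
IOV_BATCH = 1024

def write_batched(fd, chunks):
    """Write every chunk to fd using gather writes.

    Returns the number of writev syscalls issued. Hypothetical helper,
    not part of any library.
    """
    # Skip empty chunks so a zero-byte buffer can never stall the loop.
    bufs = [memoryview(c) for c in chunks if c]
    calls = 0
    while bufs:
        # One syscall hands the kernel up to IOV_BATCH buffers at once.
        written = os.writev(fd, bufs[:IOV_BATCH])
        calls += 1
        # Drop buffers that were written completely.
        while bufs and written >= len(bufs[0]):
            written -= len(bufs[0])
            bufs.pop(0)
        # Trim a buffer that was only partially written.
        if bufs and written:
            bufs[0] = bufs[0][written:]
    return calls
```

With a naive loop of `os.write(fd, chunk)`, a few hundred small chunks cost a few hundred kernel entries; with the sketch above, the same data typically goes out in one call. The same idea applies to readv on the read side and to sendmmsg/recvmmsg for datagrams, though those last two are not exposed by Python's standard library and would need C or another binding.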