> The utilization argument is specious.
It's not. Utilization is a key metric in capacity planning of large scalable apps.
Capacity is based on max utilization. A scaled web app does not have constant utilization. The parent I was responding to suggested running on one large/fast instance. OK... if you're capacity planning, are you planning for peak rps or min rps? Obviously peak. Peak times are always a fraction of your total server uptime. This means one big/fast server would be underutilized most of the time.
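To put rough numbers on that, here's a quick sketch. The traffic profile (4 peak hours at 1000 rps, 20 off-peak hours at 200 rps) and the 250 rps per small instance are made-up assumptions for illustration, not measurements from any real system.

```python
import math

# Hypothetical 24-hour profile: 4 peak hours at 1000 rps, 20 off-peak hours at 200 rps.
hourly_rps = [1000] * 4 + [200] * 20

# Option A: one big server, sized for peak and running all day.
big_capacity_rps = max(hourly_rps)
big_util = sum(hourly_rps) / (len(hourly_rps) * big_capacity_rps)

# Option B: small instances (assumed 250 rps each), scaled hour by hour to demand.
small_capacity_rps = 250
instances_per_hour = [math.ceil(rps / small_capacity_rps) for rps in hourly_rps]
scaled_util = sum(hourly_rps) / (sum(instances_per_hour) * small_capacity_rps)

print(f"one big server:   {big_util:.0%} average utilization")
print(f"scaled instances: {scaled_util:.0%} average utilization")
```

With these assumed numbers the single peak-sized server averages about a third utilized, while the scaled fleet stays close to 90%. The exact figures depend entirely on how spiky your traffic is, but the gap is the point.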
How do you expect to dynamically vertically scale in the cloud to fit demand while using a single server? Re-provision another server (either smaller or larger), redeploy all apps to it, and then reroute traffic? Great, you're doing Kubernetes' job by hand.