So, yes, horizontal scaling is good, especially for stateless workloads - but that doesn't mean you should run the most hopelessly under-performing code imaginable on each node and then be forced to scale out like this! Seriously, 4000 containers to serve 4000 concurrent requests? That is one container per in-flight request. I can't even...
I honestly can't believe the attempts in this thread to justify such a horrendously bad architecture - there are 1001 better, and often simpler, ways to approach this.
Yes, premature optimisation is bad, but optimisation here was nowhere near premature.
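
To put a rough shape on "better, and simpler": here is a minimal sketch, assuming the requests are mostly I/O-bound (the thread doesn't confirm that, so treat it as an assumption). `handle_request` and the 0.1 s sleep are hypothetical stand-ins for whatever the real service waits on. The point is only that one ordinary process can keep thousands of requests in flight at once, so concurrency alone doesn't call for one container per request.

```python
import asyncio
import time


async def handle_request(request_id: int) -> int:
    # Hypothetical handler: the sleep stands in for waiting on a
    # database or upstream call (assumed I/O-bound workload).
    await asyncio.sleep(0.1)
    return request_id


async def serve_concurrently(n: int) -> list[int]:
    # Keep all n requests in flight at once on a single event loop.
    return await asyncio.gather(*(handle_request(i) for i in range(n)))


def run(n: int = 4000) -> tuple[int, float]:
    # Returns (number of completed requests, wall-clock seconds).
    start = time.perf_counter()
    results = asyncio.run(serve_concurrently(n))
    return len(results), time.perf_counter() - start


if __name__ == "__main__":
    count, elapsed = run()
    print(f"{count} requests in {elapsed:.2f}s on one process")
```

If the work is CPU-bound instead, this particular trick doesn't apply as-is, but a handful of worker processes per node would still be a long way from 4000 containers.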