Ask HN: How exactly does Google serve traffic?
Here's the flow:
* A user requests google.com
* DNS lookup happens
* The name resolves to some IP [172.217.23.164 in Google's case]
* That IP is assigned to a server, and a "connection" is opened from the client to that server.
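To make the steps above concrete, here is a minimal Python sketch of what I think the client side does. The function names (resolve, build_request, open_connection) are just names I made up for illustration, and this is only my mental model of the flow, not a description of what a browser or Google actually does.

```python
import socket

def resolve(host: str, port: int = 443) -> list[str]:
    """Return the distinct IP addresses the name resolves to (the DNS step)."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})

def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 request that asks to keep the connection open."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: keep-alive",
        "",  # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def open_connection(host: str, port: int = 80, timeout: float = 5.0) -> socket.socket:
    """Open one TCP connection to an address the name resolves to."""
    return socket.create_connection((host, port), timeout=timeout)
```

Calling resolve("google.com") should return one or more addresses (which ones will vary by location and time), and sending build_request("google.com") over the socket from open_connection("google.com") is the "connection" I am asking about.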
My understanding is that a "connection" (from the client to Google's public IP) stays open while the public-facing IP forwards the request to a load balancer, the load balancer forwards it to an app server, and the app server replies with the content. After the reply is sent, the connection stays open for a short time (say 30 seconds) for subsequent requests. Can someone explain how exactly one server is able to handle all this? I fully expect it to be dedicated hardware just for this, but it still seems to handle a very large number of requests on its own.
While studying Linux, I read that the maximum number of connections one box (or one interface, to be exact) can have is ~65k (65535 minus used ports). So I assume Google won't be using out-of-the-box Linux, but how is Google able to handle ~10m+ simultaneous connections (this is my guess) with only one IP?
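To spell out where my ~65k number comes from: port numbers are stored in a 16-bit field, so one IP address has ports 1 to 65535. The sketch below is just that arithmetic. The function name usable_ports is mine, and the assumption baked into my reasoning (which may be exactly where I am going wrong) is that every open connection uses up one of these ports on the server.

```python
PORT_FIELD_BITS = 16  # TCP/UDP headers store port numbers in a 16-bit field

def usable_ports(ports_in_use: int = 0) -> int:
    """Ports available on one IP address: 1..65535, minus those already taken."""
    highest_port = 2 ** PORT_FIELD_BITS - 1  # 65535; port 0 is reserved
    return highest_port - ports_in_use

# My assumption (possibly wrong): one open connection == one port consumed,
# so this would also be the connection limit for a single IP.
```

With 1024 ports already in use, for example, usable_ports(1024) gives 64511, which is the kind of ceiling I have in mind.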
Suppose I have a website that must have only one public-facing server (an HAProxy instance), which passes the traffic to backend load balancers [segregated by application server role]. This works in theory, but I still don't understand the public-facing server's "connection" issue. Suppose there are 1m users trying to hit the public site: each connection will be open from the time the user connects until I send the response back, plus the keep-alive wait. I tried reading about C10M and C10K but still need an easier-to-grasp explanation.
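To put rough numbers on this, here is a back-of-the-envelope sketch using Little's law (connections open at once = arrival rate x time each connection stays open). The arrival rate and response time are made-up values for illustration, not measurements; only the 30-second keep-alive comes from my guess above.

```python
def concurrent_connections(new_conns_per_sec: int, service_ms: int, keepalive_ms: int) -> int:
    """Estimate connections open at once as arrival rate x time each stays open."""
    hold_ms = service_ms + keepalive_ms  # time one connection stays open
    return new_conns_per_sec * hold_ms // 1000

# Hypothetical inputs: 10,000 new connections per second, 200 ms to respond,
# then a 30 s keep-alive wait before the connection is closed.
estimate = concurrent_connections(10_000, 200, 30_000)
```

With those made-up inputs the estimate is 302,000 connections open at the same time on the one public-facing box, nearly all of them idle in the keep-alive wait. That is already well past the ~65k limit I mentioned, which is the part I can't get my head around.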
Thanks for reading.