The cost of doing the crypto is very real when you're talking about single-digit ms latencies :( RSA-2048 TLS certs add about 2-3ms to any connection, purely in server-side compute, even on modern CPUs (EPYC Milan). (I believe a coworker benchmarked this after disbelieving how much compute I reported we were spending, and found that RSA-2048 is something like 40x slower than ECDSA P-256.)
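To put those numbers in capacity terms, here is a rough back-of-the-envelope sketch. It takes the midpoint of the 2-3ms figure and the approximate 40x ratio at face value, and assumes the signature is the only per-handshake cost, which it isn't in practice. The function name is just for illustration.

```python
# Back-of-the-envelope: how many handshake signatures one core can do per second.
# Both inputs are the rough figures from the comment above, not fresh measurements.
RSA_SIGN_MS = 2.5        # midpoint of the quoted 2-3 ms for RSA-2048
SLOWDOWN_VS_ECDSA = 40   # "something like 40x" slower than ECDSA P-256


def handshakes_per_core_second(sign_ms: float) -> float:
    """Signatures per second on one core if signing were the only cost."""
    return 1000.0 / sign_ms


ecdsa_sign_ms = RSA_SIGN_MS / SLOWDOWN_VS_ECDSA
rsa_rate = handshakes_per_core_second(RSA_SIGN_MS)
ecdsa_rate = handshakes_per_core_second(ecdsa_sign_ms)

print(f"RSA-2048:    {RSA_SIGN_MS} ms/sign -> {rsa_rate:.0f} handshakes/core/s")
print(f"ECDSA P-256: {ecdsa_sign_ms} ms/sign -> {ecdsa_rate:.0f} handshakes/core/s")
```

Under those assumptions that's roughly 400 handshakes per core-second for RSA-2048 versus roughly 16,000 for ECDSA P-256, which is why it shows up in both the latency budget and the compute bill.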
> To go back to your HN example - HN loads fast because it is ONE IPv6 address (for me) and very lightweight so tcp slow start ramps up pretty darn fast, even going all the way to San Diego.
I used HN as an example not because it's bloated, but because it's singly-homed, which illustrates how much content placement matters. Yeah, we could quibble about 80ms vs 65ms RTT from improving peering, but the real win, as I mentioned, is in server placement. Throwing a CDN or some other reverse proxy in front of the origin helps with cacheability of your content and fanout, but it also gives you TCP termination near the users (which cuts down the cost of those startup round trips). This is why I can even talk about Los Angeles for www.google.com serving even though we don't have any core datacenters there that host the web search backends.
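A simplified model of why the startup round trips dominate, as a sketch only: it assumes a cold connection, TCP plus TLS 1.3 handshakes (1 RTT each), an initial congestion window of 10 segments (the RFC 6928 value), a 1460-byte MSS, cwnd doubling every round, and no loss. The 40 KB page size and the 10ms "nearby edge" RTT are made-up illustrative numbers, and the model covers only the client-to-terminator leg, not whatever the edge still has to fetch from the origin.

```python
import math

MSS = 1460       # bytes per segment; typical for Ethernet-MTU paths (assumption)
INIT_CWND = 10   # initial congestion window in segments (RFC 6928 value)


def slow_start_rounds(response_bytes: int, init_cwnd: int = INIT_CWND, mss: int = MSS) -> int:
    """Round trips needed to deliver the response with cwnd doubling each round, no loss."""
    segments = math.ceil(response_bytes / mss)
    rounds, cwnd, sent = 0, init_cwnd, 0
    while sent < segments:
        sent += cwnd   # one window's worth of segments per round trip
        cwnd *= 2      # idealized slow start: window doubles every RTT
        rounds += 1
    return rounds


def fetch_time_ms(rtt_ms: float, response_bytes: int, handshake_rtts: int = 2) -> float:
    """Cold-connection fetch time: handshake round trips plus data round trips."""
    # handshake_rtts=2 assumes TCP (1 RTT) + TLS 1.3 (1 RTT).
    return rtt_ms * (handshake_rtts + slow_start_rounds(response_bytes))


PAGE_BYTES = 40_000  # hypothetical page size
for label, rtt in [("80ms RTT", 80), ("65ms RTT (better peering)", 65), ("10ms RTT (nearby edge)", 10)]:
    print(f"{label}: {fetch_time_ms(rtt, PAGE_BYTES):.0f} ms")
```

With these assumptions the page needs 4 round trips in total, so better peering moves you from 320ms to 260ms, while terminating nearby brings the client-facing part down to 40ms. The multiplier is the round-trip count, so shrinking the RTT that gets multiplied is worth far more than shaving 15ms off a long path.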
(For what it's worth, I picked Chicago as a second location as "nominally good enough for the rest of the US". Could we do better? Absolutely. As you point out, major CDNs in the US have presence in most if not all of the major peering sites in the country, and either peer directly with most ISPs or meet them via IXes.)