It's not just about the link speeds; it's also about the topologies used.
Google-style infrastructure uses aggregation trees. This works well for fan-out/fan-in communication patterns, but has limited bisection bandwidth at the core/top of the tree. This can be mitigated with Clos networks / fat trees, but in practice no one goes for full bisection bandwidth on these systems, as the cost and complexity aren't justified.
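To make the oversubscription point concrete, here is a minimal sketch (all switch counts, port counts, and link speeds are illustrative, not any real datacenter's numbers) of why an aggregation tree's bisection bandwidth falls short of what the hosts can offer:

```python
# Hypothetical sketch: bisection bandwidth of a simple aggregation tree.
# All parameters here are made up for illustration.

def tree_bisection_gbps(leaf_switches, uplinks_per_leaf, link_gbps):
    """Bisection bandwidth of a two-level tree: cutting the network in
    half only crosses the uplinks of half the leaf switches."""
    return (leaf_switches // 2) * uplinks_per_leaf * link_gbps

# 32 leaf switches, each with 48 host ports but only 4 uplinks, 10 Gb/s links:
# half the hosts can offer 16 * 48 * 10 = 7680 Gb/s across the cut,
# but the bisection is only 16 * 4 * 10 = 640 Gb/s -- 12:1 oversubscribed.
hosts_gbps = (32 // 2) * 48 * 10
bisec_gbps = tree_bisection_gbps(32, 4, 10)
print(bisec_gbps, hosts_gbps / bisec_gbps)  # 640 12.0
```

A fat tree closes that gap by adding uplinks (and core switches) until the ratio reaches 1:1, which is exactly the cost that rarely gets paid in practice.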
HPC machines typically use torus topology variants. This allows 2D and 3D grid-style computations to be mapped directly onto the system with high effective bisection bandwidth. Each smallest grid element can communicate directly with its neighbors each iteration, without going through intermediate switches.
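The neighbor mapping is simple enough to sketch; this is an illustrative toy, not any machine's actual routing code:

```python
# Hypothetical sketch: direct neighbor addressing on a 2D torus.
# Each node talks to its four grid neighbors over wraparound links,
# so a halo exchange per iteration needs no intermediate switch hops.

def torus_neighbors(x, y, nx, ny):
    """Coordinates of the 4 neighbors of (x, y) on an nx-by-ny 2D torus."""
    return {
        "left":  ((x - 1) % nx, y),
        "right": ((x + 1) % nx, y),
        "down":  (x, (y - 1) % ny),
        "up":    (x, (y + 1) % ny),
    }

# Even a corner node has all four neighbors one hop away, thanks to wraparound:
print(torus_neighbors(0, 0, 4, 4))
# {'left': (3, 0), 'right': (1, 0), 'down': (0, 3), 'up': (0, 1)}
```

This is the same addressing scheme MPI exposes via Cartesian communicators, which is why stencil codes map onto tori so naturally.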
Reliability is handled quite a bit differently too. Google-style infrastructure does it with elaborations of the MapReduce approach: spot the stragglers or failures, then reallocate that work in software. HPC infrastructure puts more emphasis on hardware reliability.
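The software-side straggler handling can be sketched roughly like this (the threshold and names are illustrative assumptions, not Google's actual implementation):

```python
# Hypothetical sketch of MapReduce-style speculative execution: flag any
# task running much slower than the median, launch a backup copy of it,
# and take whichever copy finishes first.

import statistics

def find_stragglers(elapsed_by_task, slowdown=1.5):
    """Tasks running longer than slowdown x the median elapsed time
    are candidates for a speculative backup execution."""
    median = statistics.median(elapsed_by_task.values())
    return [t for t, e in elapsed_by_task.items() if e > slowdown * median]

running = {"t0": 10.0, "t1": 11.0, "t2": 9.5, "t3": 40.0}
print(find_stragglers(running))  # ['t3']
```

The point is that failure handling lives above the hardware: a slow or dead node just means some tasks get re-run elsewhere.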
You're right that FP32 and FP64 performance matter more on HPC, while Google's apps are mostly integer-only, and ML apps can use lower-precision formats like FP16.
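For a feel of what FP16 gives up, here's a small illustration using Python's `struct` half-precision codec (the specific values are just examples):

```python
# Sketch: rounding a value through IEEE 754 half precision (FP16).
# FP16 keeps a 10-bit significand (~3 decimal digits), tolerable for
# noisy ML gradients but not for accumulating long HPC sums.

import struct

def round_fp16(x):
    """Round a Python float to the nearest FP16 value and back."""
    return struct.unpack("e", struct.pack("e", x))[0]

print(round_fp16(3.14159))   # 3.140625 -- only ~3 significant digits survive
print(round_fp16(2049.0))    # 2048.0 -- integers above 2048 aren't all representable
```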