GPT-2 was trained on TPUs. (There are explicit references to TPUs in the source code: https://github.com/openai/gpt-2/blob/0574c5708b094bfa0b0f6df...)
GPT-3 was trained on a GPU cluster, probably because of Microsoft's billion-dollar investment in OpenAI (paid largely in Azure cloud credits), not because GPUs were the best choice.
To be fair, TPUv4 is not out yet, and it might catch up using the latest processes (7nm TSMC or 8nm Samsung).
For MLPerf 0.7, it's true that Google's software isn't available to the public yet. That's because they're in the middle of transitioning to Jax (and by extension, Pytorch). Once that transition is complete and the result is available to the public, you'll probably end up learning TPU programming one way or another, since there's no other practical way to, e.g., train a GAN on millions of photos.
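For anyone curious what "TPU programming" via Jax actually looks like, here's a rough sketch (purely illustrative, not anything Google has published): the same NumPy-style code gets compiled by XLA and replicated across the local TPU cores with pmap. The toy model and the 0.01 learning rate are placeholders.

  import jax
  import jax.numpy as jnp

  print(jax.devices())  # on a Cloud TPU VM this lists the TPU cores, e.g. 8 of them

  def loss_fn(params, x, y):
      # toy linear model; a real GAN generator/discriminator would go here
      pred = x @ params["w"] + params["b"]
      return jnp.mean((pred - y) ** 2)

  @jax.pmap  # replicate the step across all local TPU cores (data parallelism)
  def train_step(params, x, y):
      grads = jax.grad(loss_fn)(params, x, y)
      # plain SGD update; learning rate is arbitrary for this sketch
      return jax.tree_util.tree_map(lambda p, g: p - 0.01 * g, params, grads)

  n_dev = jax.local_device_count()
  # replicate the params and shard a batch across devices (leading axis = device axis)
  params = {"w": jnp.zeros((4, 1)), "b": jnp.zeros((1,))}
  params = jax.device_put_replicated(params, jax.local_devices())
  x = jnp.ones((n_dev, 32, 4))
  y = jnp.ones((n_dev, 32, 1))
  params = train_step(params, x, y)

The same script runs unchanged on CPU or GPU; only the device list differs, which is the whole point of the Jax/XLA stack.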
You'd think people would be happy that there are realistic alternatives to nVidia's monopoly on AI training, rather than rushing to defend it...
Wait, what? Why would transition to Jax imply transition to Pytorch?
Pointing this out is not aggressive.