The crazy thing about these models is that the compute power going into them is at least somewhat reversible.
How Well Can DeepMind's AI Learn Physics? https://www.youtube.com/watch?v=2Bw5f4vYL98 https://arxiv.org/abs/2002.09405 https://sites.google.com/corp/view/learning-to-simulate/home
Discovering Symbolic Models from Deep Learning (Physics) https://www.youtube.com/watch?v=HKJB0Bjo6tQ
Scientific Machine Learning: Physics-Informed Neural Networks with Craig Gin https://www.youtube.com/watch?v=RTPo6KgpvBA
Steve Brunton's channel is even more mind-blowing than Two Minute Papers: https://www.youtube.com/@Eigensteve
Not only can we bank computation and speed up physical simulations by 100x, but I also saw some work on designing outcomes in GoL (Game of Life).
There was a paper on using a NN to build or predict arbitrary patterns in GoL, but I can't find it right now.
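For anyone who hasn't played with GoL: the forward rule is tiny, which is part of why predicting or designing patterns with a NN is an interesting problem. Below is a minimal sketch of one update step plus a helper that produces (state, next state) pairs of the kind such a network could be trained on. This is not taken from the paper mentioned above; `life_step`, `make_pairs`, the wrap-around edges, and the grid size are all my own assumptions for illustration.

```python
import random

def life_step(grid):
    """One Game of Life update on a 2D list of 0/1 cells (edges wrap around)."""
    rows, cols = len(grid), len(grid[0])
    nxt = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count the eight neighbours, wrapping at the borders (an assumption).
            n = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            # Standard rule: birth on 3 neighbours, survival on 2 or 3.
            nxt[r][c] = 1 if n == 3 or (grid[r][c] == 1 and n == 2) else 0
    return nxt

def make_pairs(count, size=8, seed=0):
    """Hypothetical helper: random boards paired with their successor boards."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(count):
        board = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
        pairs.append((board, life_step(board)))
    return pairs

# A horizontal "blinker" flips to vertical after one step and back after two.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
flipped = life_step(blinker)
```

Going forward is cheap; the design problem is the reverse direction (finding a board that leads to a wanted outcome), which is where a learned model might help.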
The large datasets involved let us usefully (for some value of useful) bank lots of compute, but it's not obvious to me that it's done particularly efficiently compared to other things you might precompute.
For models trained to convergence, training is often quite inefficient: the weight updates decay toward zero, so most epochs individually have very little effect. I think that for e.g. Stable Diffusion, they don't train anywhere near convergence, so weight updates have a bigger average effect. Not sure if that applies to LLMs.
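To make the "updates decay to zero" point concrete, here is a toy sketch: plain gradient descent on a noise-free linear fit, recording how big each epoch's weight update is. It is a convex least-squares problem, not a neural network, and the function name, learning rate, and data are all made up for illustration, so treat it as an intuition pump rather than evidence about real training runs.

```python
import random

def update_sizes(epochs=200, lr=0.5, seed=0):
    """Run gradient descent on y = 3x + 0.5 and return each epoch's update norm."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(50)]
    ys = [3.0 * x + 0.5 for x in xs]  # noise-free targets (an assumption)
    w, b = 0.0, 0.0
    sizes = []
    for _ in range(epochs):
        # Gradient of 0.5 * mean squared error with respect to w and b.
        gw = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
        gb = sum((w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
        dw, db = lr * gw, lr * gb
        w -= dw
        b -= db
        sizes.append((dw * dw + db * db) ** 0.5)
    return sizes

sizes = update_sizes()
early = sum(sizes[:20])   # total movement in the first 20 epochs
late = sum(sizes[-20:])   # total movement in the last 20 epochs
print(f"first 20 epochs moved the weights {early / late:.1f}x more than the last 20")
```

In this toy setting, almost all of the useful movement happens early, and the later epochs cost the same compute for a tiny fraction of the change. Stopping well before convergence avoids paying for that tail.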