> Remember all the "no more data" craze? Despite no actual researcher worth their salt saying it or even hinting at it?
We ran out of fresh, interesting data. A large chunk of training now has to generate its own. Synthetic data training became a huge thing over the last year.
> Remember the "hitting walls" rhetoric?
Since then, basic training has slowed down a lot, and the improvements come more from agentic and thinking solutions, with a lot more reinforcement training than in the past.
The fact that we worked around those problems doesn't mean they weren't real. It's like when people say Y2K wasn't a problem... ignoring all the work that went into preventing issues.