I read an interesting paper recently that had a great take on this: If you add enough data, nothing is outside training data. Thus solving the generalization problem.
Wasn’t the main point of that paper, but it made me go ”Huh yeah … I guess … technically correct?”. It raises an interesting thought that yes if you just train your neural network on everything, then nothing falls outside its domain. Problem solved … now if only compute was cheap.