This isn't true. A curated data set can greatly increase learning efficiency in some cases, but it isn't strictly necessary, and learning from one accounts for only a fraction of how people learn. Additionally, every curated data set was created by humans in the first place, something language models could never do unless we programmed them to.