This is suddenly a highly debatable claim. Traditional ML was "data hungry": it required massive datasets for features, along with "clean" labels that were difficult to acquire.
You can now ask ChatGPT to generate these datasets. However, it's unclear whether future models will rely on such datasets at all if large models already have the answer. Here is an example of a ChatGPT-generated dataset for a misinformation classifier. ChatGPT can generate 10 examples every time I hit enter; I asked it to create this dataset using the following format:
sentence, misinformation (1,0), notes
---
The earth is flat, 1, This claim is widely debunked by scientific evidence and research.
Vaccines are harmful and cause autism, 1, This claim has been disproven by multiple scientific studies and is not supported by medical evidence.
The Holocaust never happened, 1, This claim is a historical denial and disregards the overwhelming evidence of the genocide of Jewish people during World War II.
Humans and dinosaurs coexisted, 1, This claim is not supported by scientific evidence and is a popular conspiracy theory.
Water boils at 212 degrees Fahrenheit, 0, This is the standard boiling point of water at sea level.
The moon landing was faked, 1, This claim is a conspiracy theory and is not supported by any evidence.
Climate change is not caused by human activity, 1, This claim is not supported by the overwhelming majority of scientific evidence and research.
The sun revolves around the earth, 1, This claim was disproven by scientific evidence in the 16th century and is now considered a flat-earth theory.
HIV does not cause AIDS, 1, This claim is not supported by scientific evidence and has been disproven by multiple studies.
Vaccines are safe and effective, 0, This claim is supported by the majority of scientific evidence and research.
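
To use output like this for training, the rows first have to be turned into structured records. Below is a minimal sketch of one way to do that in Python, assuming the output keeps the `sentence, label, notes` layout shown above. The names `Example`, `ROW`, and `parse_rows` are hypothetical, chosen for illustration. A plain CSV reader is not a safe fit here because the fields are unquoted and a sentence or note may contain commas, so the sketch splits on the first `, 0,` or `, 1,` it finds instead.

```python
import re
from typing import List, NamedTuple


class Example(NamedTuple):
    """One labeled row from the generated dataset (hypothetical structure)."""
    sentence: str
    label: int  # 1 = misinformation, 0 = not misinformation
    notes: str


# Non-greedy sentence, then a standalone 0/1 label, then the rest as notes.
ROW = re.compile(r"^(?P<sentence>.+?),\s*(?P<label>[01]),\s*(?P<notes>.*)$")


def parse_rows(text: str) -> List[Example]:
    """Parse 'sentence, label, notes' lines, skipping anything that does not fit."""
    examples = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # header line, '---' separator, or blank line
        examples.append(
            Example(match["sentence"], int(match["label"]), match["notes"])
        )
    return examples


# Two rows copied from the sample above, plus the header and separator.
SAMPLE = """sentence, misinformation (1,0), notes
---
The earth is flat, 1, This claim is widely debunked by scientific evidence and research.
Water boils at 212 degrees Fahrenheit, 0, This is the standard boiling point of water at sea level.
"""

examples = parse_rows(SAMPLE)
for ex in examples:
    print(ex.label, ex.sentence)
```

Because the labels and notes come from the model itself, any parsed rows would still need a human spot check before they are treated as ground truth.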