Here's the thing, though: for 99%+ of that content, being turned into feedstock for ML model training is about the only valuable thing that came of its existence.
If it were not for the world-ending danger of too smart an AI being developed too quickly, I'd vote for exempting ML training from copyright altogether, today - it's hard to overstate just how much more useful any copyrighted content is for society as LLM training data than as whatever it was created for originally.