I know this won't happen, of course. I'm more so hoping for laws to be updated to avoid similar kerfuffles in the future, along with massive fines to act as a deterrent, but I don't dare hope too much.
> have all of OpenAI's data for free
Doesn't really fit. Perhaps OpenAI might successfully prevent us from accessing it, but it wouldn't be "theirs" and we couldn't "have" it.
I'm not sure what kind of conversations we will be having instead, but I expect they'll be more productive than worrying about ownership of something you can't touch.
Is that understanding correct?
Anyway, the laws are mature enough for everyone to work this out in court. Maybe it turns out that they have a legitimate concern, but the way they have presented their evidence in public so far has been seriously lacking.
Rather, the actual culprit is almost certainly overfitting. The articles in question were pasted many times on different websites, so they show up in the training data repeatedly. Enough repetition like that leads to memorization.
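
For what it's worth, here is a rough sketch of how you could check the "pasted many times" part on a corpus you actually have access to. It only counts exact copies after lowercasing and whitespace cleanup. The function names are made up for illustration, and nothing here reflects how any real training pipeline deduplicates data.

```python
import hashlib
import re
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def fingerprint(text: str) -> str:
    """Hash the normalized text so copies can be compared cheaply."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def duplicate_counts(documents: list[str]) -> Counter:
    """Count how many times each normalized document appears in the corpus."""
    return Counter(fingerprint(doc) for doc in documents)


def most_repeated(documents: list[str]) -> tuple[str, int]:
    """Return the first-seen copy of the most repeated document and its count."""
    if not documents:
        raise ValueError("empty corpus")
    digest, count = duplicate_counts(documents).most_common(1)[0]
    for doc in documents:
        if fingerprint(doc) == digest:
            return doc, count
    raise AssertionError("unreachable: digest came from these documents")
```

Calling `most_repeated(corpus)` on a list of page texts gives back the most-copied text and how many times it shows up. Near-duplicates (edited or truncated reposts) would slip past this, so treat the count as a lower bound.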