Based upon what? You think other publishers use NYTimes articles for free without a license?
Presumably, if it can remember at least a paragraph or two of each article, then the same would be true of any text it ingested, and the model size would approach the dataset size (probably actually much larger). I don't believe this is the case at all. Even after searching around, I've not found any good recent examples of it regurgitating copyrighted text verbatim.
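To put rough shape on that size argument, here's a back-of-envelope sketch. Every number in it is a made-up placeholder (not a real figure for any model or dataset), and the function names are hypothetical; the point is only how "a paragraph or two per document" scales with the number of documents.

```python
def verbatim_storage_bytes(num_documents: int, bytes_kept_per_document: int) -> int:
    """Raw bytes needed to keep a fixed-size verbatim excerpt of every document."""
    return num_documents * bytes_kept_per_document


def fraction_of_dataset(bytes_kept_per_document: int, average_document_bytes: int) -> float:
    """Share of the whole dataset that those excerpts would represent."""
    return bytes_kept_per_document / average_document_bytes


# Placeholder assumptions, chosen only to make the arithmetic easy to follow.
num_documents = 1_000_000        # hypothetical number of ingested documents
bytes_kept_per_document = 1_000  # hypothetical "paragraph or two" of text
average_document_bytes = 5_000   # hypothetical average document length

total = verbatim_storage_bytes(num_documents, bytes_kept_per_document)
share = fraction_of_dataset(bytes_kept_per_document, average_document_bytes)
print(total, share)
```

This ignores compression and however a model actually encodes text, so treat it as a way to plug in your own numbers rather than as evidence either way.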
It's cool to hate AI stuff if you're a creative atm. But gotta love those generative/algorithm based PS brushes, that's still real art!
"Indeed, the opening paragraph of "A Game of Thrones" by George R.R. Martin, with the chapter titled "Bran," starts as follows:
"The morning had dawned clear and cold, with a crispness that hinted"
And then it cuts off. Whether that's because OAI now have an oh shit filter, or the model just had access to the first page or publicly available articles quoting the first line, I'm not sure.
I tried other chapters and random sections, and it could get a sentence or two right but then hallucinated. So what's more likely, NYT and GRRM? That your works are being reproduced verbatim? Or that Facebook, YouTube descriptions, fan Tumblrs and, hell, the multiple publicly available GoT-related wikis that include a variety of passages from the books were used as training data?
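If anyone wants to repeat the "a sentence or two right, then hallucinated" check less by eyeball, here's a minimal sketch using Python's standard difflib. It finds the longest run of consecutive words that a model's output shares with a reference passage. The function name and the sample strings are placeholders of mine, and the word-level comparison is an assumption; it's one simple way to measure verbatim overlap, not an established test.

```python
from difflib import SequenceMatcher
from typing import List


def longest_verbatim_run(reference: str, output: str) -> List[str]:
    """Return the longest run of consecutive words shared by both texts."""
    ref_words = reference.split()
    out_words = output.split()
    # autojunk=False turns off difflib's heuristic that ignores very
    # frequent items in long sequences, so common words still count.
    matcher = SequenceMatcher(None, ref_words, out_words, autojunk=False)
    match = matcher.find_longest_match(0, len(ref_words), 0, len(out_words))
    return ref_words[match.a:match.a + match.size]


# Placeholder strings standing in for a source passage and a model's output.
reference = "the quick brown fox jumps over the lazy dog"
output = "a quick brown fox sat on the lazy cat"

run = longest_verbatim_run(reference, output)
print(len(run), " ".join(run))
```

A run of a few words is what you'd expect from a commonly quoted line; a run of several paragraphs would be a much stronger sign of verbatim reproduction.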