https://arstechnica.com/features/2025/06/study-metas-llama-3...
So they fed the model "It takes a great deal of bravery to stand up to our " and the LLM responded "enemies, but just as much to stand up to our friends".
They repeated that for every 100 tokens of the entire book, checking whether the model would reproduce the next stretch of text. I think lots of fans could do just as well. It's pretty good evidence that the Potter books were in the training corpus, but it's not quite what people mean when they say an LLM has 'memorized' something. It's not like getting even a few pages out of the model in one go.
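If I'm reading the methodology right, the probe is basically a sliding window: feed a 100-token prompt, see if the greedy continuation matches the next chunk of the book. Here's a rough sketch of that idea; the toy "model" (which just memorizes the text perfectly) and the whitespace tokenizer are my stand-ins, not the study's actual setup:

```python
# Sliding-window memorization probe (sketch). A real test would call an
# actual LLM; here the "model" is a toy that memorized the text exactly,
# so every window matches and the score is 1.0.

def make_memorizing_model(tokens):
    # Toy model: find the prompt in the "training text" and return the
    # tokens that follow it (perfect memorization).
    def continue_greedy(prompt, n):
        for i in range(len(tokens) - len(prompt), -1, -1):
            if tokens[i:i + len(prompt)] == prompt:
                return tokens[i + len(prompt):i + len(prompt) + n]
        return []
    return continue_greedy

def memorized_fraction(tokens, model, prompt_len=100, cont_len=50):
    # Slide a prompt_len-token window across the text; count how often
    # the model's continuation reproduces the next cont_len tokens.
    hits = total = 0
    for start in range(0, len(tokens) - prompt_len - cont_len, prompt_len):
        prompt = tokens[start:start + prompt_len]
        target = tokens[start + prompt_len:start + prompt_len + cont_len]
        total += 1
        if model(prompt, cont_len) == target:
            hits += 1
    return hits / total if total else 0.0

# Stand-in "book" of 2000 distinct whitespace tokens.
text = " ".join(f"tok{i}" for i in range(2000))
tokens = text.split()
model = make_memorizing_model(tokens)
print(memorized_fraction(tokens, model))  # 1.0 for the perfect memorizer
```

The interesting number in the real study is how often a model clears that bar on actual book windows, which is far from 100% even for the most-memorized titles.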
There might be ten million people who have quoted Harry Potter at some point in their blogs or forum posts. There are only so many words in the books.