I'm not sure how your proposal would actually work. To recognize plagiarism during inference it needs to memorize harder.
Kinda funny if it works though. We'd first train them to copy their training data verbatim, then train them not to.
That is how it works, right? They're trained to copy their training data verbatim because that's the loss function. It's just that they're given so much data that we don't expect this to be possible for most of the training data given the parameter count.