I agree that it should be possible to implement that for generative AI, although training may become significantly more expensive if that information has to be maintained, and the AI companies have little interest in doing so. They’ll more likely try to heuristically assess possible copyright issues after the fact, in a post-processing step.
The more interesting question is whether copyright holders can claim unauthorized use of their works beyond the case of near-verbatim reproduction, on the grounds that the works collectively inform the AI in a more general manner.