As an aside, I do believe that LLM trainers are ignoring and violating many licenses, but open-source software is not a clear example of a violation.
Copyright only protects against reproducing non-trivial parts of the original, and where that line falls is somewhat arbitrary, so you do have to be careful when learning from copyrighted material. Programming books will have explicit clauses allowing snippet reuse, but not for teaching purposes.
This was a different argument. And there is no contradiction in treating LLMs and people separately.
> As an aside, I do believe that LLM trainers are ignoring and violating many licenses, but open-source software is not a clear example of a violation.
How?
Even if they did, if someone memorized copyrighted code and then typed it back out, that would still be a copyright violation.