It seems unfortunately clear that generative ML as typically practiced counts as fair use, even for material under the most restrictive license or with no license at all (e.g. a training set that includes Disney movies without Disney’s permission). Some people say that’s great: it’s legal, hooray. But I would love it if the law caught up and added requirements for models trained this way. If you benefit from other people’s stuff without their permission, then you ought to have to give back in some way.
If you can't prove your code was stolen, you shouldn't have a claim. And Codex should just skip emitting code that already exists in the training set. Then all that remains is creative code.
What is actually crazy is having copyright/patents/whatever apply to mathematical structures and code, and last for so long. It's rent on ideas, such a ridiculous concept.
Copyright and patents are very different. I think the general consensus among developers is that software patents are silly, but copyright on source code is very important.