That's really no different from somebody uploading proprietary code they don't own (stolen, leaked, or whatever the reason) to GitHub. GitHub has to assume that you are allowed to do so. What are they going to do otherwise, somehow manually verify that each repository is legit?
Now you might say: what about GPL code you don't own? You are allowed to redistribute it (upload it to GitHub), but because you are not the owner you can't license it to GitHub under new terms (ones that allow them to use it for ML training). But the question still is: is there anything in the GPL that forbids its code being used for ML training, even if the resulting model is proprietary, has no attributions, etc.?