No legal precedent has been set yet. The "precedent" you describe is the argument AI companies have been making (that training their models on information available on the Internet should be considered "fair use"), but whether AI training actually satisfies the four-factor test for fair use remains to be seen.
It's a null question. Training itself is neither publication nor distribution, so copyright can't be relevant at that point. "Fair use" just isn't a concept applicable to training.