Google shows snippets of copyrighted work all the time, and it certainly ingests the entire copyrighted work when Googlebot views the page to index it. The only real issue here is that NYT figured out a way to get Bingbot to look up an entire article from the internet and repeat it, which may not be kosher. But if search engines can ingest the entire content of copyrighted works (subject to robots.txt), then I don't see why AI training should be different on that front.
Of course, the real reason it is different is that it impacts different interest groups than search engines do, and the rule of law is a sham. Creatives will do anything to ensure they don't get disrupted and can continue extracting rent from society, and they have picked up plenty of rhetorical tools at their fancy colleges to put to use in that effort, unlike the industrial workers who got disrupted by automation a generation ago.