How the model is used isn’t relevant if creating it was already infringement. Training on works creates something of value, and artists want to be able to prevent that training when they aren’t compensated. There’s a long history of case law around just how much of someone’s work can be copied before it’s a problem. But here the entire work is being used, so ‘how much’ is simply everything.
The points you bring up are also relevant, but artists don’t want to look through a billion individual images to see if a specific image happens to infringe on their work.
Edit: I wrote this response to a comment that got deleted before I could post it, presumably because I edited this one: IMO many commentators are getting this wrong.
“the less likely it is that the appropriation will serve as a substitute for the original work or its plausible derivatives, shrinking the market opportunities for the copyrighted work” https://www.supremecourt.gov/opinions/22pdf/21-869_87ad.pdf
The form of these models is very different, but the purpose is to create directly competing works. Each individual output may not directly infringe on a specific work, but competing with those works very much is the goal of the model.
The comment brought up commentary about: https://en.wikipedia.org/wiki/Andy_Warhol_Foundation_for_the...