Ironically, for the first time, I think I found some perspective on the remix argument here.
Normally it's just like you say: I don't find the remixing argument persuasive, because I consider it to be a point of commonality. This time, however, my focus shifted a bit. I considered the difference in "source set".
To be more specific, it kind of dawned on me how peculiar it is to engage in creating art as a human, given what a human life looks like. How different the "setup" is between a baby just kind of existing and taking in everything, which for the most part means supremely mundane, not at all artful or aesthetic experiences, and an AI model being trained on things people uploaded. The model's training data will also have a lot of dull, irrelevant stuff, but not nearly in the same way or in the same amount, and not hitting the same registers.
I still think it's a bit of a bird vs plane comparison, but then that is also what they are saying, in a way: that it is a bird and a plane, not a bird and a bird. I do still take issue with refusing to call the result flight, though, I think.