When learning to draw, I gradually got a sense that what is really going on is that I'm gaining a more conscious command of different shapes, just like when I learned to write letters; but instead of abstract marks, I'm learning the shapes of hands, arms, etc., and from various perspectives. And so if I study a lot of the same shapes in a topic like anatomy or wildlife, I can replicate them from memory with fairly accurate proportions.
The difference between me and the AI, in its current form, is that the AI continues along the path of being an extremely smart shape recognizer and reproducer (as it should be, given that some of the first applications of the tech were to text recognition). So it can output a lot of details I can't (without lots of reference) and blend in stylistic ideas I'm unaware of. But I, while having a much more limited visual library, can mix in more details of the perspective, how anatomy and clothing work, and other kinds of logic. I can push the shapes to convey specific action and expression, design lighting situations and so on.
AI's ability to do it all in one step gives it a result that is very "savant": it doesn't know what is and isn't a coherent image, but it has total mastery of making the shapes and applying rendering. Some of the things I've seen it do with prompts are wildly creative in interpretation as a result. It's a good tool.