I think that prompts like 'Dutch people skating on a lake' or 'girl with a pearl earring' or 'Dutch religious couple in front of their barn', given to an AI that has not been trained on various works, will produce just noise. And if those particular works (you know the ones, right?) were not part of the input then the AI would never produce anything looking like the original, no matter how specific you made the prompt. It takes human input to animate it, and even then what it produces does not look original to me, whereas any five-year-old is able to produce entirely original works of art, none of which can be reduced to a prompt.
Prompts are instructions; they are settings on a mixer, not the music produced by the artists at the microphones.