To paraphrase, "Artists (and observers of art) get rewarded for making (and observing) novel patterns: data that is neither arbitrary (like incompressible random white noise) nor regular in an already known way, but regular in a way that is new with respect to the observer's current knowledge, yet learnable (that is, after learning, fewer computational resources are needed to encode the data)".
In other words, enjoyment of art is about learning (easy) patterns. Schmidhuber likens "fun" to the improvement of an observer's ability to compress a scene.
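A rough way to make the compression idea concrete, as a sketch only: treat the observer's knowledge as a zlib preset dictionary and call the bytes saved after "learning" the compression progress. Using zlib as the observer, memorization as the learning step, and the names `code_length` and `compression_progress` are all assumptions made for illustration; none of this is Schmidhuber's actual formulation.

```python
import random
import zlib


def code_length(data: bytes, knowledge: bytes = b"") -> int:
    """Bytes zlib needs for `data`, with `knowledge` as a preset dictionary."""
    if knowledge:
        comp = zlib.compressobj(level=9, zdict=knowledge)
    else:
        comp = zlib.compressobj(level=9)
    return len(comp.compress(data) + comp.flush())


def compression_progress(data: bytes, before: bytes, after: bytes) -> int:
    """Bytes saved on `data` when knowledge grows from `before` to `after`."""
    return code_length(data, before) - code_length(data, after)


def random_bytes(rng: random.Random, n: int) -> bytes:
    """Deterministic pseudo-random bytes, so the example is repeatable."""
    return bytes(rng.getrandbits(8) for _ in range(n))


rng = random.Random(0)
motif = random_bytes(rng, 1000)       # a pattern the observer has not met yet
noise_seen = random_bytes(rng, 1000)  # noise the observer has studied
noise_next = random_bytes(rng, 1000)  # fresh noise from the same source

# Novel but learnable: once the motif is known, it encodes very cheaply.
novel_but_learnable = compression_progress(motif, b"", motif)
# Already known: learning the same thing again saves nothing.
already_known = compression_progress(motif, motif, motif)
# Pure noise: having studied old noise does not help with new noise.
pure_noise = compression_progress(noise_next, b"", noise_seen)

print(novel_but_learnable, already_known, pure_noise)
```

The first number should be large, the second is zero, and the third should sit close to zero, which matches the claim that neither the already familiar nor the incompressible is rewarding.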
The source is worth the read: http://people.idsia.ch/~juergen/creativity.html
The comparisons to music theory in this thread are apt. Music theory always lags behind music production.
How can you understand aesthetics without understanding creativity?
Obviously images convey much more information than music, so any theory that doesn't encompass the semantics of the subject will miss most of the signal. But is there a theory for the presentation and composition of the subject? To some degree, I'm confident there is.
Some of the methods used to debug the deep learning of images already do a fair job of showing the locus of focus in the image where the DNN found maximum information. I can see such a technique discovering many of the techniques used by artists and photographers to direct the observer's eye or juxtapose objects that conflict.
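One debugging method of this kind is occlusion sensitivity: mask one region of the image at a time and record how much the model's score drops. Whether that is the method meant here is an assumption. The sketch below uses plain nested lists and a made-up `toy_score` function standing in for a real network's class score, so it shows the mechanics only.

```python
from typing import Callable, List

Image = List[List[float]]


def occlusion_map(image: Image, score: Callable[[Image], float],
                  patch: int = 2, fill: float = 0.0) -> Image:
    """Score drop caused by masking each patch x patch block of `image`."""
    height, width = len(image), len(image[0])
    baseline = score(image)
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            occluded = [row[:] for row in image]  # copy, then mask one block
            for y in rows:
                for x in cols:
                    occluded[y][x] = fill
            drop = baseline - score(occluded)
            for y in rows:
                for x in cols:
                    heat[y][x] = drop  # large drop = region the score relies on
    return heat


def toy_score(image: Image) -> float:
    """Hypothetical stand-in for a DNN: only looks at the upper-left quadrant."""
    height, width = len(image), len(image[0])
    return sum(image[y][x] for y in range(height // 2) for x in range(width // 2))


image = [[1.0] * 4 for _ in range(4)]
heat = occlusion_map(image, toy_score)
for row in heat:
    print(row)
```

With a real network in place of `toy_score`, the high-valued regions of `heat` would be the candidates for where the composition directs attention.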
>No True Scotsman thinks that subject matter is irrelevant
fixed that for you
[1] Deep Visual-Semantic Alignments for Generating Image Descriptions - cs.stanford.edu/people/karpathy/cvpr2015.pdf
[2] Deep Learning for Content-Based Image Retrieval - www.research.larc.smu.edu.sg/mlg/papers/MM14-fp336-hoi.pdf
[3] Deep Learning for Content-Based Image Retrieval - www.cs.rutgers.edu/~elgammal/pub/MTA_2014_Saleh.pdf