But that's still different from taking in the actual JPEG data, which is sort of what the parent gets at: something has to decode that, and that software isn't a neural net.
(Further, when I worked w/ ML that dealt w/ image data, we had a lot of non-ML code written around it to support it, dealing with the various facets of running in the cloud, where to get the data, where to store the results, who to notify about the results, and a bunch of preprocessing on the image, such as removing pointless borders that humans put around images.)
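To give a flavor of that kind of preprocessing, here is a rough sketch of border removal in plain Python. It is not the code we actually ran; `trim_border` and its `tolerance` parameter are names made up for illustration, and it assumes the image has already been decoded into a 2D list of grayscale values and that the top-left pixel is part of the border.

```python
def trim_border(pixels, tolerance=0):
    """Crop edge rows/columns whose pixels all match the top-left pixel.

    `pixels` is a 2D list of grayscale ints (an assumption for this sketch;
    a real pipeline would work on whatever the JPEG decoder hands back).
    """
    if not pixels or not pixels[0]:
        return pixels

    border = pixels[0][0]  # assumed to be the border colour

    def is_border(value):
        return abs(value - border) <= tolerance

    # Rows that contain at least one non-border pixel.
    rows = [i for i, row in enumerate(pixels)
            if not all(is_border(v) for v in row)]
    if not rows:
        return pixels  # image is uniform; nothing sensible to crop to

    # Columns that contain at least one non-border pixel.
    cols = [j for j in range(len(pixels[0]))
            if not all(is_border(row[j]) for row in pixels)]

    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in pixels[top:bottom + 1]]


image = [
    [255, 255, 255, 255],
    [255,  10,  20, 255],
    [255,  30,  40, 255],
    [255, 255, 255, 255],
]
cropped = trim_border(image)
```

None of this is a neural net, and it has to run before the model ever sees the image.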