They're on a grid though, why would this be hard?
Compared to all the other feats of machine learning that have blown my mind, parsing a photo of a limited set of a handful of different piece variants, in two colors, that located on a grid, doesn't seem too difficult.