They ran PCA over two sets of metrics, took the top 25 components from each set, and combined them into a 50-dimensional space. A model fit on these dimensions and the measured responses explained 57% of the variance in real cell firing rates (much better than other models, including a 5-layer CNN).
This is pretty cool. I'd like to see a follow-up where the chosen dimensions were further refined using something a bit more iterative than an arbitrary PCA cutoff.
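For anyone who wants to poke at the idea, here is a minimal sketch of the pipeline as described above: PCA on two feature sets, top 25 components from each, concatenated into 50 dimensions, then a linear fit to firing rates. Everything here is synthetic and assumed for illustration: the `shape_raw` and `appearance_raw` arrays, their sizes, the simulated cell, and the train/test split are all made up, and this is not the authors' code or data.

```python
import numpy as np

def top_pcs(X, k):
    """Project the rows of X onto their top-k principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal axes, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

rng = np.random.default_rng(0)
n_faces, n_train = 2000, 1500

# Hypothetical raw metrics; real ones would come from annotated face images.
shape_raw = rng.normal(size=(n_faces, 60))
appearance_raw = rng.normal(size=(n_faces, 80))

# 25 + 25 components -> one 50-dimensional description per face.
features = np.hstack([top_pcs(shape_raw, 25), top_pcs(appearance_raw, 25)])

# Simulated cell: linear in the 50-d features plus noise (an assumption).
true_axis = rng.normal(size=50)
rates = features @ true_axis + rng.normal(scale=5.0, size=n_faces)

# Ordinary least squares with an intercept, fit on the training faces only.
A_train = np.column_stack([features[:n_train], np.ones(n_train)])
coef, *_ = np.linalg.lstsq(A_train, rates[:n_train], rcond=None)

# Fraction of held-out variance explained by the linear model.
A_test = np.column_stack([features[n_train:], np.ones(n_faces - n_train)])
pred = A_test @ coef
ss_res = np.sum((rates[n_train:] - pred) ** 2)
ss_tot = np.sum((rates[n_train:] - rates[n_train:].mean()) ** 2)
explained = 1.0 - ss_res / ss_tot
print(f"held-out explained variance: {explained:.2f}")
```

Swapping the fixed `25` for a value chosen by cross-validation would be one simple way to replace the arbitrary cutoff.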
Also, I really want to know what eye motion was present during each trial. This paper presents a very "instantaneous" recognition perspective and doesn't talk about integration over time or the impact of sequential perception of face components on recognition. (E.g., an upside-down face is hard to recognize because your gaze has to move up from the eyes to see the mouth, which is a sequence rarely encountered in the real world.)
Incredibly important point. Looking at the physiology of the eye and the early visual system together, it's not clear that we can see anything without motion: object motion or saccade (saccade appears to be fundamental). Start with Lettvin's famous frog's eye paper.
More interestingly, it does appear that recognition of a 2D image may be a complex, abstract learned behavior, while 3D recognition is at its core innate. And that writing (well, drawing) preceded reading.
stimulus size spanned 5.7 degrees. The fixation spot size was 0.2 degrees in diameter and the fixation window was a square with the diameter of 2.5 degrees.
2.5 is relatively large in my experience; it might be due to lots of noise on the signal. Either way, there won't be true saccades (well, there shouldn't be: parts of the recorded data with saccades outside of that fixation window should get rejected). Microsaccades were probably present, but those aren't mentioned.
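To make the rejection rule concrete, here is a rough sketch of a fixation-window check. It assumes the quoted "2.5 degrees" is the side length of a square window centered on the fixation spot, and that gaze samples are given as x/y offsets in degrees; the function name `trial_is_valid` and the example traces are hypothetical, not taken from the paper.

```python
import numpy as np

# Assumption: 2.5 degrees is the full side length of the square window.
HALF_WIDTH_DEG = 2.5 / 2

def trial_is_valid(gaze_deg, half_width=HALF_WIDTH_DEG):
    """Return True if every gaze sample stays inside the square window.

    gaze_deg: (n_samples, 2) array of x, y offsets from the fixation
    spot, in degrees of visual angle.
    """
    gaze_deg = np.asarray(gaze_deg, dtype=float)
    return bool(np.all(np.abs(gaze_deg) <= half_width))

# Made-up traces: one steady fixation, one with a saccade leaving the window.
steady = np.array([[0.1, -0.2], [0.3, 0.1], [-0.4, 0.2]])
saccade = np.array([[0.1, 0.0], [1.8, 0.3], [2.0, 0.4]])

print(trial_is_valid(steady), trial_is_valid(saccade))
```

Note that a check like this says nothing about microsaccades, which can stay well inside a window of this size.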
Oh, BTW I do ML consulting lol
Inception-v3 has about 50 layers, and that's considered a lot (requires considerable processing power to train).
Reading the Cell article on this [0], I couldn't help but see the similarities with OpenFace [1].
[0] http://www.cell.com/cell/fulltext/S0092-8674(17)30538-X
[1] https://cmusatyalab.github.io/openface/#overview
Advances in machine learning have been made by training a computerized mimic of a neural network on a given task. Though the networks are successful, they are also a black box because it is hard to reconstruct how they achieve their result.
“This has given neuroscience a sense of pessimism that the brain is similarly a black box,” she said. “Our paper provides a counterexample. We’re recording from neurons at the highest stage of the visual system and can see that there’s no black box. My bet is that that will be true throughout the brain.”
I'm curious how this study would explain or contradict the results of that study. Also, were the monkeys raised by monkey parents or human scientists? Monkeys that were allowed to imprint on humans might be more similar to humans and, yet, unrepresentative of monkeys.
[1] I think it was https://www.ncbi.nlm.nih.gov/pubmed/15943669
But that is just a guess and might also go against the findings of the article, which I haven't fully understood yet.
In any case it seems to me that the brain is optimized for processing certain features, as apparently there are people who are unable to recognize faces (face blindness).
"These dimensions create a mental “face space” in which an infinite number of faces can be recognized. There is probably an average face, or something like it, at the origin, and the brain measures the deviation from this base."
"Dr. Tsao said she was particularly impressed to find she could design a whole series of faces that a given face cell would not respond to, because they lacked its preferred combination of dimensions. This ruled out a possible alternative method of face identification: that the face cells were comparing incoming images with a set of standard reference faces and looking for differences."
I'm surprised that they didn't attempt to generate a face with exactly 0 on all dimensions.
It would be fascinating to know what the most memorable face looks like - and if it's different per-brain. (Presumably it is monkey shaped!)
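A toy version of the "face space" idea in the quotes, assuming each cell simply responds in proportion to the projection of a face vector onto its preferred axis. That linear model, the 50-d size, and all names below are assumptions for illustration only. Under this model, any change orthogonal to the axis leaves the response untouched, and the all-zeros vector stands in for the average face at the origin.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 50  # assumed dimensionality of the face space

# Hypothetical preferred axis of one cell, normalized to unit length.
axis = rng.normal(size=dim)
axis /= np.linalg.norm(axis)

def response(face):
    """Toy cell: response is the projection of the face onto its axis."""
    return float(face @ axis)

face = rng.normal(size=dim)

# Build a direction orthogonal to the axis by removing the parallel part.
delta = rng.normal(size=dim)
delta -= (delta @ axis) * axis

# A series of different faces that this cell cannot tell apart.
variants = [face + step * delta for step in (-2.0, -1.0, 1.0, 2.0)]

print(response(face), [round(response(v), 6) for v in variants])
print(response(np.zeros(dim)))  # the "average face" at the origin
```

In this picture the zero-on-all-dimensions face is just the mean of the face set used to build the space, so what it looks like depends entirely on that set.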
Only a few days ago there was a similar study that reconstructed faces from human brain activity by building a latent space:
https://arxiv.org/abs/1705.07109
https://twitter.com/ccnlab/status/866548346751725568 (animation, over time more dimensions from this latent space (afaik PCA components) are added)
Given the limitations of fMRI (we cannot do single-cell recordings in human brains), the results are not as accurate, but to my knowledge this is the best we can do in humans so far.