There can be no such thing. Both fonts accomplish a purpose. Many like one because they feel it is easier to design with. Any reputable designer will instantly recognize both. They are two of the most used fonts in the world. There is no way to perform a blind test.
The closest thing design has to reputable researchers is designers themselves. They spend their lives experimenting with fonts, and most prefer one to the other. Some of them are following popular opinion mindlessly. Many aren't. Until some metric can be made for good design (fat chance; note too that readability is not the only important piece of this puzzle), no journal-worthy piece of science is going to come forward.
No one is asking about "reputable designers". The tiny minority of people who can recognize hundreds of fonts at a glance are not interesting when it comes to this question.
There are certainly some objective metrics you can consider. Some that very quickly come to my mind:
1) Effect on reading speed
2) Effect on reading "endurance" (a little harder to define, but reasonable)
3) Effect on recall of text (prose)
4) Effect on recall in advertisements
5) Distance at which text of a given size becomes readable
There's also softer stuff along the lines of "does it make you feel warm and fuzzy", and you can still design experiments for that type of thing too.
So, I'm sorry, but I just don't believe you when you say you can't measure any of this.
I am advocating seeing a font as a tool designers use, not an end-user product. When I choose to develop a program using Lisp instead of Python (or vice versa), I pick based on preference, knowing that I can produce a competent program with either (though those programs will certainly differ somewhat), not because of end-user usability studies of programs created with each.
I believe that equally readable, memorable, aesthetically pleasing design can be produced with either font; they are, after all, very similar. Designers find one more pleasant to use.