Sure, I was saying "better" in the sense that, for a given task X, it can do better than Y% of humans.
> since we hand-fed it the answers, it falls a little flat for me
We didn't really hand-feed it any answers though, did we? If you put a human in a white box their whole life, with access to the entire dataset on a screen but no social interaction, nothing to see aside from the text, nothing to hear, nothing to feel, nothing to taste, etc., I'd be very impressed if they were then able to create answers that seem to display such a thoughtful and complex understanding of the world.