nonfamous 3y ago

Ah, I had missed that interpretation. Although that may explain why GPT-4 got it wrong: there's so much context in its training data about the relationship between lions and humans, and about this puzzle specifically, that, like this human, its response was swayed...

colinmorelli 3y ago

But I think that's the whole point of the exercise? GPT-4 is leaning on stringing tokens together in a reply rather than reasoning through the problem itself, and that reasoning, I would think, would be "required" for AGI (though we may end up finding out that well-trained language models in specific domains eliminate the need for generalized cognition).

In any case, it's an interesting exercise regardless of your opinion/stance on the matter!

arrrg 3y ago

But the human (in the comment chain) here made exactly the same mistake!

In that sense this test doesn’t seem to be a good fit for testing reasoning capabilities, since it’s also easy for humans to get wrong (and humans also don’t always reason about everything from first principles, especially if they have similar answers already cached in their memory).

It seems you would need novel puzzles that aren’t really common (even if in kind) and don’t really sound similar to existing puzzles to get a handle on its reasoning capabilities.

colinmorelli 3y ago

The human recognized that they made the mistake and fixed it. As mentioned in the original comment, GPT failed to recognize the mistake even after being told. That's the key here that indicates it can't "reason."

There are open questions about whether or not it really needs to reason given sufficient training, but that seems to be the gap here between the human and the machine.

famouswaffles 3y ago

Bing/GPT-4 gets the answer right if you rewrite the problem in a way that doesn't bias it toward common priors.

Or just tell it it's making a wrong assumption.
