0 points · next_xibalba · 3y ago · 0 comments

They trumpet the exam results, but isn't it likely that the model has just memorized the exam?

qt31415926 3y ago

It's trained on pre-2021 data. Looks like they tested on the most recent tests (i.e. 2022-2023) or practice exams. But yeah, standardized tests are heavily weighted toward pattern matching, which is what GPT-4 is good at, as shown by its failure at the hindsight neglect inverse-scaling problem.

allthatisreal 3y ago

I believe they showed that GPT-4 reversed the trend on the hindsight neglect problem. Search for "hindsight neglect" on the website and you can see that its accuracy on the problem shot up to 100%.

qt31415926 3y ago

oh my bad, totally misread that

pphysch 3y ago

Well, yeah. It's an LLM, it's not reasoning about anything.
