- maybe ALL humans would fail the test in some way; e.g. suppose everybody gets at least 10 of those wrong, and the average person gets 100 wrong.
- still, as long as most people get each word right, your LLM would get every single response correct (because for each item in the test, 900+ people out of a thousand gave the same correct answer in the training set).
In that sense, it's totally possible for a system trained on a vast vat of average-human input to generate super-human outputs.
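This wisdom-of-crowds effect is easy to simulate. The sketch below (an illustration, not a model of any actual training process) gives every simulated person a 10% per-item error rate, then takes the majority answer per item as a stand-in for what training on the pooled data would favor:

```python
import random
from collections import Counter

random.seed(0)

NUM_PEOPLE = 1000
NUM_ITEMS = 1000
ERROR_RATE = 0.10  # each person gets ~10% of items wrong

# Each item's correct answer is "A"; a mistake records "B" instead.
def answer_sheet():
    return ["B" if random.random() < ERROR_RATE else "A"
            for _ in range(NUM_ITEMS)]

sheets = [answer_sheet() for _ in range(NUM_PEOPLE)]

# Majority vote per item across all thousand people.
majority = [Counter(sheet[i] for sheet in sheets).most_common(1)[0][0]
            for i in range(NUM_ITEMS)]

# Every individual errs somewhere, yet the majority answer
# is right on every item.
per_person_errors = [sum(a != "A" for a in s) for s in sheets]
majority_errors = sum(a != "A" for a in majority)
print(min(per_person_errors), majority_errors)
```

Even the single best simulated person still gets dozens of items wrong, while the majority answer scores a perfect sheet: with 1000 voters and a 10% error rate, the chance that the wrong answer wins any given item is vanishingly small.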