0 points · civilized · 2y ago · 0 comments

> humans tend to agree that in a vast array of cases state of the art LLMs do predict tokens very well

This argument is backwards. Humans don't measure the next token prediction ability of the agents they speak to, human or AI. We rate speakers on whether they seem to understand what we say in context and respond by contributing useful information and analysis.

The attributes you're saying can be inferred from known superior next token prediction ability are the things we can actually detect and measure, at least qualitatively. Next token prediction quality is not measurable by humans in any human-meaningful way. Improving test cross entropy by 50% doesn't mean anything to us. It is irrelevant except as a mechanism to train LLMs.

4 comments · 2 top-level

TeMPOraL · 2y ago · 1 in thread

Point is that the simplest way to excel at next token prediction in the way humans consider correct - which is rated by how well people feel the predictor mimics human understanding - is to actually have a world model and the other components of human understanding.

Understanding and compression are the same thing. LLMs are fed a huge chunk of the totality of human knowledge and optimized to compress it well. They for sure aren't doing it by Huffman-encoding a multidimensional lookup table.

civilized (OP) · 2y ago

> Point is that the simplest way to excel at next token prediction in the way humans consider correct - which is rated by how well people feel the predictor mimics human understanding - is to actually have a world model and the other components of human understanding.

This is a speculative theory for why a next token predictor might sound like it knows what it's talking about. Not something we actually know.

jameshart · 2y ago · 1 in thread

I mean, I think it was implied that humans judge the ‘next token prediction’ ability of LLMs as being good based on the quality of the overall output.

civilized (OP) · 2y ago

In which case you have a trivial point rather than a backwards argument: "the output seems like it knows what it's talking about, and the easiest way to explain that is if it really knows what it's talking about."
