I am so floored that at least half of this community, usually skeptical to a fault, evangelizes LLMs so ardently. Truly blows my mind.
I’m open to them becoming more than a statistical token predictor, and I think it would be really neat to see that happen.
They’re nowhere close to anything other than a next-token-predictor.
What exactly do you mean by that? I've seen this exact comment stated many times, but I always wonder:
What limitations of current AI chatbots do you see that are specifically due to their use of next token prediction?
It’s kind of like you’re saying “prove god doesn’t exist” when it’s supposed to be “prove god exists.”
If a problem isn’t documented, LLMs simply have nowhere to go. They can’t really handle the knowledge boundary [1] at all; with no reasoning ability, they just hallucinate or run in circles, trying the same nearest-known solution over and over.
It’s awesome that they frequently get things right and can work at a computer’s speed, but it’s very obvious that there really isn’t anything in there that we would call “reasoning.”
I don't want to directly address your claim about a lack of generalization, because there's a more basic issue with the GP's statement. Though I will say that today's models do seem to generalize quite a bit better than you make it sound.
But more importantly, neither you nor the GP offers any evidence that this is due specifically to using next token prediction as a mechanism.
Why would it not be possible for a highly generalizing model to use next token prediction for its output?
That inference doesn't follow for me at all, which is why the GP's statement reads so strangely.
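To make the distinction concrete, here's a minimal sketch of what "next token prediction as a mechanism" amounts to. The `toy_model` here is a hypothetical stand-in I'm inventing for illustration; a real LLM scores tokens with a neural network instead. The point is that the decoding loop only constrains the output interface, not whatever the model does internally:

```python
# Minimal sketch of autoregressive next-token prediction.
# "toy_model" is a hypothetical stand-in: any function mapping a
# token sequence to a score per vocabulary item. The loop below is
# the entire decoding mechanism being debated.

VOCAB = ["<eos>", "the", "cat", "sat"]

def toy_model(tokens: list[str]) -> list[float]:
    # Stand-in for a learned model: scores every vocabulary item.
    # A real LLM would compute these scores with a neural network.
    order = ["the", "cat", "sat", "<eos>"]
    nxt = order[min(len(tokens), len(order) - 1)]
    return [1.0 if tok == nxt else 0.0 for tok in VOCAB]

def generate(prompt: list[str], max_len: int = 10) -> list[str]:
    tokens = list(prompt)
    while len(tokens) < max_len:
        scores = toy_model(tokens)
        next_tok = VOCAB[scores.index(max(scores))]  # greedy pick
        if next_tok == "<eos>":
            break
        tokens.append(next_tok)
    return tokens

print(generate([]))  # ['the', 'cat', 'sat']
```

Swap `toy_model` for an actual transformer and the loop is unchanged, which is why "it's just predicting the next token" says nothing by itself about how much the model generalizes.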
If I have never eaten a hamburger but own a McDonald’s franchise, am I making an authentic American hamburger?
If I have never eaten fries before and I buy some frozen ones from Walmart, heat them up, and throw them in the trash, did I make authentic fries?
Obviously the answer is yes, and these questions are completely irrelevant to my sentience.
I'm more shocked that so many people seem unable to come to grips with the fact that something can be a next token predictor and also demonstrate intelligence. That's what blows my mind: people unable to see that something can be more than the sum of its parts. To them, if something is a token predictor, then clearly it can't be doing anything impressive, even while they watch it do impressive things.
Except LLMs have not shown much intelligence. Wisdom yes, intelligence no. LLMs are language models, not 'world' models. It's the difference between being wise and being smart. LLMs are very wise, in that they have effectively memorized the answer to every question humanity has written down. OTOH, they are pretty dumb: they don't "understand" the output they produce.
> To them, if something is a token predictor, then clearly it can't be doing anything impressive
That's shifting the goalposts. Nobody said that a next token predictor can't do impressive things, but there is a big gap between doing impressive things and claims like "replace every software developer in the world within the next 5 years."