cypress66 · 3y ago · 0 points · 0 comments

There's nothing "by definition" that says so.

In fact many propose that when you train an LLM, in order to be able to predict the next word with enough accuracy, it must internally build a world model.

Yann Lecun is very salty about chatgpt, I wouldn't take his word seriously.

7 comments · 3 top-level

alfalfasprout · 3y ago · 4 in thread

Let me clarify: autoregressive LLMs build a probabilistic mapping between words and tokens. They don't actually understand what these concepts mean. Only what they appear in conjunction with, etc. We (and most animals) interact with the physical world and learn through a combination of doing, experiencing, biology, and book learning. That lets us reason about how things work in unseen contexts, and we know what we know vs. what we don't know (whether we express it or not is a different story).

> Yann Lecun is very salty about chatgpt, I wouldn't take his word seriously.

With all due respect, he's not salty at all. He has even overseen plenty of cutting-edge research in the LLM space. But he has rightly pointed out what they can and can't do.

There are too many people encountering a coherent-sounding chatbot for the first time and engaging in anthropomorphism.
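To make "only what they appear in conjunction with" concrete, here is a minimal sketch of the simplest possible autoregressive language model: a bigram table built from co-occurrence counts. This is an illustration under loud assumptions, not how any real LLM is implemented. Real models condition on long contexts with neural networks rather than a lookup table, and the toy corpus and the function names `train_bigram` and `next_token_probs` are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Turn the raw follow-counts for `prev` into a probability distribution."""
    total = sum(counts[prev].values())
    return {tok: n / total for tok, n in counts[prev].items()}


# Hypothetical toy corpus, chosen only for illustration.
corpus = "the cat sat on the mat the cat ate".split()
model = train_bigram(corpus)

# "the" is followed by "cat" twice and "mat" once in the corpus.
print(next_token_probs(model, "the"))
```

The table knows that "cat" tends to follow "the" in this corpus, and nothing else about cats. Whether scaling this idea up changes that in kind is exactly what the rest of the thread argues about.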

rcme · 3y ago

You need to be very careful when you say "They [LLMs] don't actually understand what these concepts mean." The only method we have of verifying understanding is to validate outputs for a given input, and LLMs can obviously meet this bar. Unless you have another way?

YeGoblynQueenne · 3y ago

It's more like we don't have any way to "verify" understanding, or measure it. We can "validate" the outputs of an LLM, but what do those outputs mean? Who's to say? Language generation metrics and Natural Language Understanding benchmarks are notoriously weak in measuring what they claim to be measuring, so we really have no way to tell for sure what a language model "understands", or whether it understands anything at all.

Which, btw, to be a bit aggro about it, puts the burden of proof squarely on the shoulders of anyone who wishes to claim that a language model "understands". Else, one risks being hit with a small china teapot falling from space.

https://en.wikipedia.org/wiki/Russell%27s_teapot

Which might cause grave injury indeed.

d110af5ccf · 3y ago

> They don't actually understand what these concepts mean.

You say this so confidently. But can you define in terms that are directly quantifiable what "understanding a concept" actually means?

I don't believe that anyone can (at present, anyway) although there are certainly some interesting theories and heuristics that have been put forward by various people.

YeGoblynQueenne · 3y ago

>> You say this so confidently. But can you define in terms that are directly quantifiable what "understanding a concept" actually means?

Hold on there, can you "define in terms that are directly quantifiable what" 'God is real' "actually means"? If you can't, does that mean that atheists, like me, can't continue to say very confidently indeed that he doesn't?

Do I, as an atheist, need to provide proof of God's non-existence, or is it the job of people who believe in Gods to bring evidence of their existence?

And do you see the parallel here with what you are saying above? If you are saying that LLMs "understand" (you, or anyone else), why is it skeptics that have to provide evidence that they don't? You're the one who's making claims that can't be falsified.

Although I guess you have to agree with the general idea of falsifiability being better than the alternative, to see what I mean.

visarga · 3y ago

LLMs are models of language, and language is a model of the world. So we have a model of a model of the world, but an LLM does not get much grounding in the real world.

There is a phase of training called multi-task instruction tuning where the LLMs solve problems and thus are grounded in exact answers. That makes the difference between the difficult-to-handle GPT-3 from 2020 and the better-behaved GPT-3 of 2022. But that dataset is small compared to the raw text used in pre-training, so it won't do the grounding perfectly.

Real grounding comes from real feedback. Even humans need the feedback, or we just go off on wild tangents.

YeGoblynQueenne · 3y ago

>> In fact many propose that when you train an LLM, in order to be able to predict the next word with enough accuracy, it must internally build a world model.

Oh, but it does build a world model. Only, its "world" is a gigantic table of token collocations and their probabilities. So, for example, it can tell you with great accuracy that "king - man + woman = queen", but that's the only way it can map "king" to something else: by moving around its embedding space, I guess. Unfortunately, if you can only map between tokens, and you have no representation of the meaning of those tokens other than more tokens, of which you don't have any other representation either, then any mapping you can build won't really help you understand what those tokens mean.

If only we could find a way to map tokens to real-world entities, or to some kind of representation of ... things... outside of token space.

(yes yes, the frame problem, old as AI)
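The "king - man + woman = queen" arithmetic mentioned above can be sketched with vector addition and cosine similarity. This is a toy illustration only: the three-dimensional vectors below are hand-picked assumptions (real embeddings are learned and have hundreds of dimensions or more), and the helper names `cosine` and `analogy` are made up for this example.

```python
import math

# Toy embeddings, hand-picked for illustration only (an assumption, not
# learned from data). Dimensions loosely mean (royalty, maleness, femaleness).
EMBEDDINGS = {
    "king":  [0.9, 0.9, 0.1],
    "queen": [0.9, 0.1, 0.9],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, embeddings=EMBEDDINGS):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z
              for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = (w for w in embeddings if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(target, embeddings[w]))


print(analogy("king", "man", "woman"))  # prints "queen" for these toy vectors
```

Note that the result is just another key in the same table: the lookup moves from token to token and never leaves the embedding space, which is the limitation the comment is pointing at.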
