In fact, many propose that when you train an LLM, it must internally build a world model in order to predict the next word with enough accuracy.
Yann LeCun is very salty about ChatGPT, I wouldn't take his word seriously.
> Yann LeCun is very salty about ChatGPT, I wouldn't take his word seriously.
With all due respect, he's not salty at all. He's even overseen plenty of cutting-edge research in the LLM space. But he has rightly pointed out what they can and can't do.
There are too many people who are encountering, for the first time, a chatbot that sounds coherent, and who are engaging in anthropomorphism.
Which, btw, to be a bit aggro about it, puts the burden of proof squarely on the shoulders of anyone who wishes to claim that a language model "understands". Else, one risks being hit with a small china teapot falling from space.
https://en.wikipedia.org/wiki/Russell%27s_teapot
Which might cause grave injury indeed.
You say this so confidently. But can you define in terms that are directly quantifiable what "understanding a concept" actually means?
I don't believe that anyone can (at present, anyway) although there are certainly some interesting theories and heuristics that have been put forward by various people.
Hold on there, can you "define in terms that are directly quantifiable what" 'God is real' "actually means"? If you can't, does that mean that atheists, like me, can't continue to say very confidently indeed that he doesn't?
Do I, as an atheist, need to provide proof of God's non-existence, or is it the job of people who believe in Gods to bring evidence of their existence?
And do you see the parallel here with what you are saying above? If you are saying that LLMs "understand" (you, or anyone else), why is it skeptics that have to provide evidence that they don't? You're the one who's making claims that can't be falsified.
Although I guess you have to agree with the general idea of falsifiability being better than the alternative, to see what I mean.
There is a phase of training called multi-task instruction tuning where the LLMs solve problems and thus are grounded in exact answers. That makes the difference between the difficult-to-handle GPT-3 from 2020 and the better-behaved GPT-3 of 2022. But that dataset is small compared to the raw text used in pre-training, so it won't do the grounding perfectly.
Real grounding comes from real feedback; even humans need feedback, or we just go off on wild tangents.
Oh, but it does build a world model. Only, its "world" is a gigantic table of token collocations and their probabilities. So, for example, it can tell you with great accuracy that "king - man + woman = queen", but that's the only way it can map "king" to something else: by moving around its embedding space, I guess. Unfortunately, if you can only map between tokens, and your only representation of what those tokens mean is more tokens, which in turn have no other representation, well, then, any mapping you can build won't really help you understand what those tokens mean.
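To make the "king - man + woman = queen" bit concrete, here's a minimal sketch of that kind of vector arithmetic. Everything in it is an assumption for illustration: the 2-D vectors are hand-picked (one axis loosely "royalty", one loosely "gender"), and `EMBEDDINGS`, `cosine` and `analogy` are hypothetical names, not any real model's API. Real embeddings are learned and have hundreds of dimensions.

```python
import math

# Hypothetical, hand-picked 2-D embeddings (axis 0 ~ "royalty", axis 1 ~ "gender").
# Assumption for illustration only; not taken from any trained model.
EMBEDDINGS = {
    "king":  (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man":   (0.0, 1.0),
    "woman": (0.0, -1.0),
    "apple": (-1.0, 0.2),
}


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Return the token nearest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = tuple(
        x - y + z
        for x, y, z in zip(EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c])
    )
    candidates = [t for t in EMBEDDINGS if t not in (a, b, c)]
    return max(candidates, key=lambda t: cosine(target, EMBEDDINGS[t]))


print(analogy("king", "man", "woman"))  # prints: queen
```

Note that what comes out the other end is just another key in the table. The arithmetic moves you from one token to another token, and nothing in it points outside the dictionary.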
If only we could find a way to map tokens to real-world entities, or to some kind of representation of ... things... outside of token space.
(yes yes, the frame problem, old as AI)