(a) induce an LLM to take natural language inputs and generate statements in a probabilistic programming language that formally models concepts, objects, actions, etc. in a symbolic world model, drawing from a large body of research on symbolic AI that goes back to pre-deep-learning days; and
(b) perform inference using the generated formal statements, i.e., compute probability distributions over the space of possible world states that are consistent with and conditioned on the natural-language input to the LLM.
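For readers unfamiliar with step (b), here's a minimal hand-rolled sketch of what "inference by conditioning" means, using rejection sampling in plain Python. The paper uses a real probabilistic programming language; this toy coin model and all the numbers in it are my own invention, just to show the shape of the idea:

```python
import random

# Toy "world model": a coin whose bias is unknown. In the paper's
# approach, an LLM would translate natural language (e.g. "the coin
# came up heads both times") into formal condition statements; here
# the translation is hard-coded as the `if` check below.

def prior():
    # Uniform prior over the coin's bias.
    return random.random()

def simulate(bias, n=2):
    # Flip the coin n times, count heads.
    return sum(random.random() < bias for _ in range(n))

def posterior_mean(n_samples=50_000):
    # Rejection sampling: keep only the worlds consistent with the
    # conditioned observation "two heads out of two flips".
    accepted = []
    for _ in range(n_samples):
        bias = prior()
        if simulate(bias) == 2:  # condition(heads == 2)
            accepted.append(bias)
    return sum(accepted) / len(accepted)

print(round(posterior_mean(), 2))  # ~0.75, the mean of Beta(3, 1)
```

The point is that the LLM's only job is translation; the probability distribution over world states comes from the sampler, not the network.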
If this approach works at a larger scale, it represents a possible solution for grounding LLMs so they stop making stuff up -- an important unsolved problem.
The public repo is at https://github.com/gabegrand/world-models but the code necessary for replicating results has not been published yet.
The volume of interesting new research being done on LLMs continues to amaze me.
We sure live in interesting times!
---
PS. If any of the authors are around, please feel free to point out any errors in my understanding.
No, the bigger problem with current LLMs is that even with high-quality factual training data, they often generate seemingly plausible nonsense (e.g., citing nonexistent websites/papers as their sources).
This is by design imo; they’re trained to generate ‘likely’ text, and they do that extremely well. There’s no guarantee for faithful retrieval from a corpus.
It remains to be seen whether you can truly be an effective intelligence with understanding of the world if all you have are symbols that you have to manipulate.
Nevertheless, it begins with far too many hedges:
> By scaling to even larger datasets and neural networks, LLMs appeared to learn not only the structure of language, but capacities for some kinds of thinking
There are two hypotheses for how LLMs generate apparently "thought-expressing" outputs: Hyp1 -- the model is sampling from similar training text that was distributed so as to express a thought by some agent; Hyp2 -- the model itself has the capacity to form that thought.
It is absolutely trivial to show Hyp2 is false:
> Current LLMs can produce impressive results on a set of linguistic inputs and then fail completely on others that make trivial alterations to the same underlying domain.
Indeed: because there're no relevant prior cases to sample from in that case.
> These issues make it difficult to evaluate whether LLMs have acquired cognitive capacities such as social reasoning and theory of mind
It doesn't. It's trivial: the disproof lies one sentence above. It's just that many don't like the answer. Such capacities survive trivial permutations -- LLMs do not. So Hyp2 is clearly false.
No it's not
> Current LLMs can produce impressive results on a set of linguistic inputs and then fail completely on others that make trivial alterations to the same underlying domain.
> Indeed: because there are no relevant prior cases to sample from in that case.
That's not what that tells us. Humans have weird failure modes that look absurd outside the context of evolutionary biology (some still look absurd) and that don't speak to any lack or presence of intelligence or complex thought. Not sure why it's so hard to grasp that LLMs are bound to have odd failure modes regardless of the above.
And "trivial" here is relative. In my experience, "trivial" alterations often turn out to be the kind a person might not pay close attention to either, and be similarly tricked by.
For instance, GPT-4 might solve a classic puzzle correctly, then fail the same puzzle subtly changed. I've found that, more often than not, simply changing the names of variables in the puzzle to something completely different can get it to solve the changed puzzle. It takes memory shortcuts, but it can be pulled out of them. LLMs have failure modes that look like human failure modes too.
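A toy sketch of that renaming experiment, in case anyone wants to try it. The puzzle text and renamings are invented examples, and `query_llm` is a hypothetical stand-in for whatever API you'd actually call:

```python
# Perturb a classic puzzle by systematically renaming its entities,
# then compare the model's answers on the variants against its
# answer on the original.

PUZZLE = ("A farmer must carry a wolf, a goat, and a cabbage across "
          "a river, but the boat holds only the farmer and one item.")

RENAMINGS = [
    {"farmer": "courier", "wolf": "fox", "goat": "hen", "cabbage": "grain"},
    {"farmer": "robot", "wolf": "acid", "goat": "metal", "cabbage": "cloth"},
]

def perturb(text, mapping):
    # Naive whole-word substitution; fine for a quick experiment.
    for old, new in mapping.items():
        text = text.replace(old, new)
    return text

variants = [perturb(PUZZLE, m) for m in RENAMINGS]
for v in variants:
    print(v)
    # answer = query_llm(v)  # hypothetical call; diff against baseline
```

If the underlying "representation" were really being used, accuracy should be invariant under these renamings.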
E.g., do you have the capacity to reason about physics? Well, if you're extremely drunk, less so. But not if I permute the name of the object.
> I've found more often than not, simply changing names of variables
Yes, lol --- why do you think that is?
Because in the digitised dataset of "everything ever written", those names correspond to places in that dataset that the LLM can sample from. Which shows Hyp1 to be the case.
P(Hyp1| ChangeNameMakesDifference) >>>>>> P(Hyp2|ChangeNameMakesDifference)
To such a degree that the latter is vanishingly close to zero.
> look absurd outside the context of evolutionary biology
For humans, everything (everything!) is within the context of evolutionary biology!
> LLMs have failure modes that look like human failure modes too.
Yes -- because LLMs are trained on 2020 Reddit.
To investigate precisely this question in a clear and unambiguous way, I trained an LLM from scratch to sort lists of numbers. It learned to sort them correctly, and the entropy is such that it's absolutely impossible that it could have done this by Hyp1 (sampling from similar text in the training set).
https://jbconsulting.substack.com/p/its-not-just-statistics-...
Now, there is room to argue that it applies a world model when given lists of numbers with a hidden logical structure, but not when given lists of words with a hidden logical structure -- but I think the ball is in your court to make that argument. (And a transformer only ever sees lists of numbers anyway.)
Also, Machine Learning 101: you test your models on a test set that is disjoint from the training set. To clarify, we do this not because it's in the book and those are the rules, but because testing the model on held-out data lets us predict the error the model will have on unseen data (i.e., data not available to the experimenter). Under PAC-learning assumptions, a learner is said to learn a concept when it can correctly label instances of that concept within some probability of error. In real-world situations we do not know the true concept, so we test on held-out data to approximate the probability of error.
Bottom line, if you train a model to do a thing and you don't test it carefully to figure out its error, you might claim it's learned something, but in truth, you have no idea what it's learned.
(To clarify: you tested on the train data assuming there's a low probability of overlap. Don't do that if you're trying to understand what your models can do).
Formally, what hypotheses are you comparing? What do you think the specific hypothesis of the "AI = stats" person is? It isn't that the NN literally remembers data tokens, right?
In any case:
The issue with forcing NNs to model mathematical features is that the structure of the data itself has those properties. So the distributional hypothesis is true for sorting ordinals.
But it's really obviously false for natural language. The properties of the world are not the properties of word order... being red isn't "red follows words like...".
1. A thought is a representation of a situation
2. A representation generates entailments of that situation
3. Language is a many-to-one translation from these representations to symbols
4. Understanding language is reversing these symbols back into thoughts (i.e., representations)
So,
5. If agent A understands sentence X then A forms the relevant representation of X.
6. If an agent has a representation of S, it can state entailments of S (e.g., counterfactuals).
Now, split X into Xc = "canonical descriptions of S" and trivial permutations Xp.
(such that the corpus frequency of Xp is low, but the tokens of Xp are individually common)
Form entailments of X, say Y -- sentences that are canonically implied by the truth of X.
7. If the LLM understood that X entails Y, it would be via constructing the representation S -- which is reached regardless of which sentence in X was used.
8. Train an LLM on Xc, and its accuracy on judging whether Y is entailed by Xp is at chance.
9. Since using Xp sentences causes it to fail, it does not predict Y via S.
QED.
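A minimal runnable illustration of steps 7-9, with a deliberately extreme learner (all sentences here are invented examples, not from any dataset): a pure lookup table -- Hyp1 in its purest "retrieve, never represent" form -- scores perfectly on the canonical sentences it was trained on and at roughly chance on trivially permuted paraphrases:

```python
# Canonical sentences Xc mapped to their entailment judgments.
Xc = {
    "the ball is red": True,
    "the cube is blue": True,
    "the ball is blue": False,
}
# Trivial permutations Xp with identical meanings.
Xp = {
    "red is the colour of the ball": True,
    "blue is the colour of the cube": True,
    "blue is the colour of the ball": False,
}

def memorizer(sentence, table=Xc, default=False):
    # Hyp1 in its purest form: look up, never represent.
    return table.get(sentence, default)

acc_canonical = sum(memorizer(s) == y for s, y in Xc.items()) / len(Xc)
acc_permuted = sum(memorizer(s) == y for s, y in Xp.items()) / len(Xp)
print(acc_canonical, acc_permuted)  # perfect vs. roughly chance
```

A learner that built the representation S would score identically on Xc and Xp; the gap is the signature of sampling from history.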
And we can say,
1. Appearing to judge Y entailed-by X is possible via simple sampling of (X, Y) in historical cases.
2. LLMs are just such a sampling.
so,
3. Plus inference to the best explanation:
4. LLMs sample historical cases rather than form representations.
Incidentally, "sampling of historical cases" is already something we knew -- so this entire argument is basically unnecessary. And only necessary because PhDs have been turned into start-up hype men.
How do we know? Who knows what they're trained on?
Your hypotheses 1 and 2 are not so different when you consider that the similarity function used to match text in the training data must be highly nontrivial. If it were not, then things like GPT-3 would have been possible a long time ago. As a concrete example, LLMs can do decent reasoning entirely in rot13; the relevant rot13'ed text is likely very rare in their training data. The fact that the similarity function can "see through" rot13 means that it can in principle include nontrivial computations.
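For anyone who wants to see just how far the "similar text" being matched is from surface form, Python's stdlib can produce the rot13 version of a simple syllogism (the example sentence is my own):

```python
import codecs

# rot13 shifts every letter 13 places; reasoning-heavy text in this
# encoding is presumably vanishingly rare in any training corpus.
plain = ("All men are mortal. Socrates is a man. "
         "Therefore Socrates is mortal.")
cipher = codecs.encode(plain, "rot13")
print(cipher)

# rot13 is its own inverse.
assert codecs.decode(cipher, "rot13") == plain
```

Any "similarity function" that matches the ciphered text to the relevant plain-text cases has to be doing a nontrivial computation.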
There's also another hypothesis: Hyp3 -- that Hyp1 and Hyp2 converge as the LLM is scaled up (more training data, more dimensions in the latent space), and in the limit become equivalent.
But it cannot converge, since most of the relevant cases lie in the future -- no amount of scaling puts them in the training data.
Your argument also implies Hyp1 and Hyp2 are exclusive; clearly both can be true, and in fact must be true -- unless you are claiming that you do not "sample" from similar language to express your own thoughts? Where does your language come from then, if not learning from previous experience?
The test for a capacity C in system 1 has nothing to do with proxy measures of that capacity in system 2.
The capacity of an oven to cook food may be measured by how much smoke it lets off when burning -- but no amount of "smoke" establishes that a dry-ice machine can cook.
This type of "engineering thinking" is pseudoscience.
There is intelligent thought and action, and there is unintelligent thought and action. The intelligent is that "which is checked" (intus legere); the other, the "impulsive", is not.
> How could the common-sense background knowledge needed for dynamic world model synthesis be represented, even in principle? Modern game engines may provide important clues.
This has often been my starting point in modelling the difference between a model-of-pixels vs. a world model. Any given video game session can be "replayed" by a model of its pixels: but you cannot play the game with such a model. It does not represent the causal laws of the game.
Even if you had all possible games you could not resolve between player-caused and world-caused frames.
> A key question is how to model this capability. How do minds craft bespoke world models on the fly, drawing in just enough of our knowledge about the world to answer the questions of interest?
This requires a body: the missing information is causal, and the body resolves the difference between P(A|B) and P(A|do(B)) by making bodily actions that are interpreted as necessarily causal.
In the case of video games, since we hold the controller, we can resolve P(EnemyDead | EnemyHit) vs. P(EnemyDead | do(EnemyHit)) -- the button press makes the hit an intervention rather than an observation.
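A toy simulation of why the two differ (all the numbers and the "scripted phase" mechanic are invented): in this world the enemy's death is driven entirely by a hidden confounder, not by the hit, so conditioning on a hit and intervening to cause one give very different probabilities:

```python
import random

rng = random.Random(42)

def world(do_hit=None):
    # Hidden confounder: during a scripted phase, allies hit the
    # enemy more often AND the enemy is scripted to die more often,
    # independent of the hit itself.
    phase = rng.random() < 0.5
    if do_hit is None:
        hit = rng.random() < (0.9 if phase else 0.1)  # observed hit
    else:
        hit = do_hit                                   # intervention
    dead = rng.random() < (0.8 if phase else 0.1)      # driven by phase
    return hit, dead

N = 100_000
obs = [world() for _ in range(N)]
p_dead_given_hit = sum(d for h, d in obs if h) / sum(h for h, _ in obs)

interv = [world(do_hit=True) for _ in range(N)]
p_dead_do_hit = sum(d for _, d in interv) / N

print(round(p_dead_given_hit, 2), round(p_dead_do_hit, 2))  # ~0.73 vs ~0.45
```

An agent that only watches frames sees the 0.73; only an agent holding the controller can discover the 0.45.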
"The vast majority of our knowledge, skills, and thoughts are not verbalizable. That's one reason machines will never acquire common sense solely by reading text."
Now, sure, you can't describe qualia, but that's basically a subjective artefact of how we sense the world, and (to add another unfounded hot take) likely not critical to understanding it on a physical level.
It's a topic that's too large for an HN comment, but "explaining" things in words comes after the fact, and is mostly limited to the small subset of our experience and skillset that is amenable to it.
Note that humans are animals too, btw. And conversely, I would consider nonverbal people as humans as well.
I would wager that if you raised a newborn human in the absence of any physical human contact, but somehow taught them to read/write and gave them access to a universal corpus (text only, no audio/video), or heck, even internet access with `curl`, and then dropped them into the "real world" at age 25, they would be utterly incapable of performing, say, a basic service job at a restaurant.
Words help us symbolize and reason about our sense experiences, but they are not a substitute for them.
This is either some profound miscomprehension of just how many of your skills and thoughts are inexpressible in words, or some statement of how profoundly shallow your skills and thoughts actually are.
Of course, that does leave the door open: when these models are put in a physical body, a robot, and have to interact with the world, then maybe they can gain that "common sense".
This doesn't mean a silicon-based AI can't become conscious of skills that are hard to verbalize. Just that they don't yet have all the same inputs that we have. And when they do, and they have internal thoughts, they will have the same difficulty verbalizing them that we do.
His research at Meta is in the analytic approach to machine learning. As a result, he is very unabashed in expressing distaste for ML approaches that don't align with his research.
Really, there is no bigger sore loser than LeCun in internalizing the bitter lesson. Quoting him without this context is deliberately misleading.
I would like to explain, but I can't quite put it into words...
:)
They keep saying LLMs, but only GPT-4 can do it at that level. Although some of the examples were pretty basic, so I guess it really depends on the level of complexity.
I feel like this could be really useful in cases where you want some kind of auditable, machine-interpretable rationale for doing something, such as self-driving cars, military applications, or maybe some robots. It could make it feasible to add a layer of hard rules, in a way.
Reasoning is just prediction with memory towards an objective.
Once large models have these perpetually operating sensory loops with objective functions, the ability to distinguish model-powered intelligence from human-like intelligence tends to drop.
World models are meant for simulating environments. If this were something like testing whether an LLM-driven game agent can form thoughts as it plays through some game, it would be very interesting. Maybe someone on HN can do this?
You need constant modeling of touch/smell/vision/temperature, etc.
These senses give us an actual understanding of the physical world and drive our behavior in a way that pure language will never be able to.
"Sufficient equivalence" is important, because sure, it may not _really_ know the color red or the qualia of being, but if, for all intents and purposes, the LLM's internal model provides predictive power and answers correctly as if it does have a world model, then what is the difference?
Paper : Hi! I am 94 pages long.
I : omg...