Building AGI Using Language Models

(leogao.dev)

26 points · leogao · 5y ago · 19 comments

15 comments · 5 top-level

hprotagonist · 5y ago · 7 in thread

I remain to be convinced that encoding statistical information about syntactical manipulation alone will somehow magically convert to semantic knowledge and agency if you just try really hard and do it a lot.

leogao (OP) · 5y ago

Your doubt about more data -> better semantic knowledge about the world is well-placed and I freely admit that at this stage it's mostly conjecture, although GPT-3 does provide some evidence. However, my point is that once this hurdle is overcome, building agency on top isn't that much of a leap.

johnsimer · 5y ago

Language is just the shadows of reality, and studying the shadows perfectly still doesn’t give you enough info to understand reality

Cf. Plato’s Cave

mordymoop · 5y ago

How much have you played with GPT3?

hprotagonist · 5y ago

enough to know it's ELIZA on steroids. Neat party trick, and there's no there there.

nwienert · 5y ago

No agency, that requires the model manipulating the model, and no one else being able to. And if you want true agency - no kill switch. Though idk why we’d want to create anything close to true agency, seems like a pretty dumb idea imo.

gnramires · 5y ago

Not agency, agency is a separate system that needs to be developed, but semantic knowledge, absolutely.

The word to explain this is emergence. This is indeed not quite intuitive, but neural networks exhibit many emergent phenomena. It is tied to their ability to perform effective/efficient computation -- after all, the "goal" of nature with our brain design (from which abstractions and knowledge emerge) was also effective cognition. For example, when you feed a large convolutional neural (classifier) network diverse objects and human faces, you can verify experimentally that the convolutional filters resemble "concepts", subdivisions used to assemble a larger whole (nose, eyes, mouth, etc. are the components of a face). That's the strategy of divide and conquer, a basic aspect of efficient cognition/computation. The network has enough neurons and a good enough prior structure[1] that this effective architecture emerges from gradient descent training. It really is wonderful. You can see it as a primitive/rough, but powerful, form of algorithm search (or algorithm optimization). The best algorithms tend to employ abstractions.

Natural internal representations emerge.

Emotions probably don't fully emerge (in the whole breadth of emotions), although they may exist as internal representations when dealing with human bodies of work. That's because emotions are tied to our motivational system: they compose qualities (qualia) that propel us to do various activities, generally (but not always) tied to straightforward evolutionarily beneficial goals: enjoying eating, craving sleep, having sex, engaging with the community (mammals rely heavily on the group for survival), etc.

Without agency, it's unlikely (but I can't say with certainty) those emotional qualia would emerge with accurate fidelity, simply because a non-agent model wouldn't employ them to function and wouldn't optimize for the same functions. The extent of emergence is limited to understanding and reproducing human production, not accurately replicating its exact (internal) quality, which is derived from its computational structure and relationship with motivation. It (in this case, GPT-3) only needs to understand those human emotions insofar as that helps predict human behavior to a reasonable accuracy. As the corpus goes to infinity, with a sufficiently diverse expressive[2] literature, you could conjecture emergence is guaranteed[3] (just how large a corpus would we need though? Who knows). But I find it likely that in practice you really need to set up the network with agency (and train it adequately to exhibit effective, motivated behavior) before it starts reproducing those qualities well, i.e. deriving some of its understanding from practical situations that are too sparse in (or absent from) the training corpus.

[1] In fully connected networks, usually a funnel or hourglass shape; in convolutional networks, a decreasing size in the 2D image domain, and increasing size across it, as if images were transforming into concepts; this structure is baked in usually (although you can do hyperparameter optimization, etc).
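The convolutional shape described in footnote [1] can be made concrete with a little arithmetic. This is only a sketch under assumptions: it reads "increasing size across it" as a growing channel count, and the kernel size, stride, padding, starting channel count and 64x64 input are illustrative choices, not anything stated in the comment.

```python
def conv_out(size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Spatial size after one convolution (standard output-size formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def pyramid_shapes(size: int = 64, channels: int = 16, layers: int = 4):
    """Return (channels, height, width) per layer for a toy conv stack.

    Assumed design: each layer halves the image and doubles the channels.
    """
    shapes = [(3, size, size)]  # assumed RGB input
    for _ in range(layers):
        size = conv_out(size)
        shapes.append((channels, size, size))
        channels *= 2
    return shapes


for shape in pyramid_shapes():
    print(shape)
```

With these assumed settings the spatial extent shrinks from 64x64 to 4x4 while the channel count grows from 3 to 128, which is the "images transforming into concepts" funnel the footnote describes.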

Finally, it is essentially impossible for agency to emerge (unless your training code has serious bugs, but I can't find a plausible way) from a (purely) predictive/generative neural network. There is simply no concept of itself, still less of its own goals, anywhere in its structure (only the concept of other persons/characters/things, or even some understanding of the agency of other things) or training objective. Worse, it never has the opportunity to exercise this goal-oriented behavior in a comprehensive setting (again, this depends on the training corpus).

[2] In terms of expressing internal states, and accurately describing our actions.

[3] Then there's the question of whether the representational structure we have is a unique solution -- i.e. whether there are other ways of feeling the same feelings while acting exactly the same.

Obs: I intend to flesh this argument out into an article and post it here later -- I find this a fairly recurrent doubt about neural network behavior.

leogao (OP) · 5y ago

Setting aside any objections one may raise regarding the term emergence[1], the objective of this post was to discuss how while GPT-x alone cannot become an agent, its internal representations can be harnessed to create an agent. Note that human-like emotions and qualia are not in any way necessary for an artificial agent.

[1] https://www.lesswrong.com/posts/8QzZKw9WHRxjR4948/the-futili...

mrfusion · 5y ago · 2 in thread

I don’t completely follow this. Can anyone explain?

rahimnathwani · 5y ago

tl;dr Language models like GPT-3 incorporate some model of the world. That's why they can generate plausible-sounding text. Future language models will be larger, more powerful, and have more complete models of the world. So we will be able to ask the language model questions, like 'what will happen if we do X?'. By compiling the answers to many such questions, we can figure out the best[0] thing to do.

[0] Assuming you have some utility function you can maximize.
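The loop in this summary (ask "what will happen if we do X?" for many X, then pick the best answer) can be sketched in a few lines. This is a minimal sketch under loud assumptions, not the article's implementation: `predict_outcome` is a hypothetical stand-in for querying a language model (here just a lookup table), and `utility` is a made-up scoring function.

```python
from typing import Callable, Iterable

# Hypothetical stand-in for a language model's world model. A real system
# would prompt a model here; this table is purely illustrative.
TOY_WORLD_MODEL = {
    "water the plant": "the plant grows",
    "ignore the plant": "the plant wilts",
    "overwater the plant": "the roots rot",
}


def predict_outcome(action: str) -> str:
    """Answer 'what will happen if we do X?' (toy lookup, an assumption)."""
    return TOY_WORLD_MODEL.get(action, "unknown")


def utility(outcome: str) -> float:
    """Hypothetical utility function: higher is better."""
    scores = {"the plant grows": 1.0, "the plant wilts": -1.0, "the roots rot": -2.0}
    return scores.get(outcome, 0.0)


def choose_action(
    actions: Iterable[str],
    predict: Callable[[str], str],
    score: Callable[[str], float],
) -> str:
    """Pick the action whose predicted outcome has the highest utility."""
    return max(actions, key=lambda action: score(predict(action)))


print(choose_action(TOY_WORLD_MODEL, predict_outcome, utility))
```

The agent part is only the `max` over candidate actions; everything interesting lives in how good `predict` is, which is the point the summary makes about larger models.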

leogao (OP) · 5y ago

This is a great summary!

msamogh1 · 5y ago · 1 in thread

It talks about how you can go from a language model that simply generates text to an agent that is capable of performing actions in the real world.

Essentially, the missing pieces in the picture come down to input and output modules. "How do you formulate any given problem into a form that a language model can answer?".
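One way to picture the input and output modules is as a prompt formatter and an answer parser wrapped around the model. The sketch below is an assumption-heavy illustration: the prompt wording, the 0 to 10 scale, and the names `build_prompt` and `parse_score` are all hypothetical, and no real model is called.

```python
import re
from typing import Optional

# Hypothetical prompt layout; the wording and the 0-10 scale are assumptions.
PROMPT_TEMPLATE = (
    "Current state: {state}\n"
    "Proposed action: {action}\n"
    "Question: On a scale of 0 to 10, how good is the result?\n"
    "Answer:"
)


def build_prompt(state: str, action: str) -> str:
    """Input module: turn a problem into text a language model can continue."""
    return PROMPT_TEMPLATE.format(state=state, action=action)


def parse_score(completion: str) -> Optional[int]:
    """Output module: pull the first integer out of the model's free text.

    Returns None when no number is found or it falls outside 0-10.
    """
    match = re.search(r"\d+", completion)
    if match is None:
        return None
    value = int(match.group())
    return value if 0 <= value <= 10 else None


print(build_prompt("the door is locked", "use the key"))
print(parse_score(" 7, since the door would open"))
```

The parser has to tolerate free-form completions, which is why it returns `None` instead of raising when the text contains no usable number.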

leogao (OP) · 5y ago

I don't doubt the input and output modules will need some work, but in the grand scheme of things probably not that much. The big missing piece imo is just models with better world models—and it doesn't look like the scaling wars will stop anytime soon, nor does it look like size will stop helping, at least not anytime soon. (https://www.gwern.net/newsletter/2020/05#baking-the-cake)

ilaksh · 5y ago

To me GPT-3 excitement is equivalent to when people get hyped about "defeating aging" after seeing some resveratrol trial or something.

Language is only part of it. And you can't get complete understanding without integrating spatial information. Take a look at Josh Tenenbaum's work for an explanation of why.

bionhoward · 5y ago

If the agent state is symbolic, that’s cool and interesting, but isn’t reality sub-symbolic?
