But I would challenge you to imagine the situation the LLM is actually in. Do you understand Thai? If so, in the following, feel free to imagine some other language which you don't know and which is not closely related to any language you do know. Suppose I gather reams and reams of Thai text, without images, without context. Books without their covers or anything that would indicate genre. There's no Thai-English dictionary available, nor any Thai speakers. You aren't taught which symbols map to which sounds. You're on your own with a giant pile of text, and asked to learn to predict symbols. If you had sufficient opportunity to study this pile of text, you'd begin to pick out patterns of which words appear together, and what order words often appear in. Suppose you study this giant stack of Thai text for years in isolation. After all this study, you're good enough that given a few written Thai words, you can write sequences of words that are likely to follow, given what you know of these patterns. You can fill in blanks. But should anyone guess that you "know" what you're saying? Nothing has ever indicated to you what any of these words _mean_. If you give back a sequence of words which a Thai speaker understands to be expressing an opinion about monetary policy, because you read several similar sequences in the pile, is that even your opinion?
I think algorithms can 'know' something, given sufficient grounding. LLMs 'know' what text looks like. They can 'know' what tokens belong where, even if they don't know anything about the things referred to. That's all, because that's what they have to learn from. I think a game-playing RL-trained agent can 'know' the likely state change that a given action will cause. An image segmentation model can 'know' which value differences in adjacent pixels are segment boundaries.
But if we want AIs that 'know' the same things we know, then we have to build them to perceive in a multi-modal way, and interact with stuff in the world, rather than just self-supervising on piles of internet data.
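The segmentation claim above can be made concrete with a toy sketch: a "model" that only "knows" which differences between adjacent pixel values mark a boundary, and nothing about what the segments are. The image and threshold here are invented for illustration.

```python
import numpy as np

# A 3x4 toy image with a sharp vertical boundary between two flat regions.
image = np.array([
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
], dtype=float)

# Horizontal differences between adjacent pixels; large jumps are
# treated as segment boundaries. The threshold is arbitrary for this toy.
diffs = np.abs(np.diff(image, axis=1))
boundaries = diffs > 5

print(boundaries)  # column 1 (the 0 -> 9 jump) is True in every row
```

The point is that such a system can be entirely competent at its task while having no access to what the regions it separates actually are.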
Instead of it being an unknown language, it's English (a language you know), but every single noun, verb, adjective, or preposition has been changed to Thai (a language you don't know).
The Mæw nạ̀ng bn the S̄eụ̄̀x.
> If you had sufficient opportunity to study this pile of text, you'd begin to pick out patterns of which words appear together, and what order words often appear in. Suppose you study this giant stack of Thai text for years in isolation. After all this study, you're good enough that given a few written Thai words, you can write sequences of words that are likely to follow, given what you know of these patterns.
Right, and to get good at this task, you'd need to build models in your head. You would think to yourself: right, a Mæw tends to nạ̀ng bn a S̄eụ̄̀x. And you would build up a model of the sort of things a Mæw might do, the situations it might be in, in an abstract way. As you absorbed more and more data, you would adjust these abstract models to fit the evidence you had.
You don't know what a Mæw is. But if someone asks you about a Mæw, you can talk about how it relates to S̄eụ̄̀x, Plā and H̄nū. You know stuff about Mæw, but it's abstract.
Fascinating, and seems like a plausible description of what's going on.
This feels related to the idea of the Chinese room. There, I think, the resolution is that the human following instructions does not understand Chinese, but the room, the system of instructions plus the human following them, does. In a similar way, an individual neuron obviously doesn't understand anything, but brains do.
I guess it just feels like this general argument, that merely seeing things and making predictions that turn out to be right isn't enough to understand, will never go away. We could have a full-fledged robot walking around having conversations, and I could still dispute its ability to really understand. It's just learned to imitate other humans, I'd say. It doesn't really know anything; it's just following a statistical model to decide how to move an arm.
I think it's obviously no, because we don't have sensations of magnetic fields. It's the question of what it is like to be a bat, raised by Thomas Nagel. The aliens can give us their words for conscious magnetic sensations, which we can learn to use, but we won't experience them. We're basically p-zombies when it comes to non-human experiences.
> There, I think, the resolution is that the human following instructions does not understand Chinese, but the room, the system of instructions plus the human following them, does. In a similar way, an individual neuron obviously doesn't understand anything, but brains do.
Searle's response to the systems objection is that we already know that brains understand Chinese. But we don't know this for the room. I would further say that brains alone don't understand anything, humans understand things as language users embedded in a social and physical world. One can invoke Wittgenstein and language games here.
Ion channels do not have even a tiny speck of consciousness, no matter how you organize them, but our brain does indeed need them to be conscious (and incidentally it relies on a whole lot more "stupid" parts than that: try being conscious without oxygen, or glucose).
I would go as far as making consciousness an emergent property of interaction with the environment: what does it mean to be conscious if nothing is there to confirm that you are indeed a singular consciousness? Is it possible to understand the concept of self if you have no concept of other beings?
I certainly don't see that as obvious, and I would guess that while you can learn _about_ their perceptual mode, you can't learn what it is like to perceive magnetic fields just through talking about it. I would consider the Mary's Room thought experiment, and the What Is It Like To Be a Bat paper from Nagel.
I think there's a relationship to the Chinese Room, but I want to be clear. In the original formulation, the person in the room follows a book of pre-provided instructions to produce a response. The LLM and person in the Thai text completion scenario must learn an equivalent set of instructions themselves, and for this I would claim that they are comparable to the human + book combination in the original Chinese Room. The person who learns to complete Thai text doesn't know what they're talking about, but they know more than the person following instructions in the Chinese Room. But clearly they still don't know what a Thai speaker knows.
> I guess it just feels like this general argument, that merely seeing things and making predictions that turn out to be right isn't enough to understand it will never go away. We could have a full fledged robot walking around having conversations and I could dispute its ability to really understand.
No, perhaps the end of my original statement didn't make this clear, but I think AI systems _can_ know things, and knowing is not a binary but part of a range. StabilityAI / DALL-e know quite a bit about the relationship between texts and images, and the structure within images -- but they _don't_ know about bodies, physical reality, etc etc. A system that has multiple modalities of perception, learns to physically navigate the world, interact with objects, make and execute plans by understanding the likely effects of actions, etc -- knows and understands a lot. I'm not arguing about a hard limitation of AI; I'm arguing about a limitation of the way our current AIs are built and trained.
In the Chinese room, the instructions you're given to manipulate symbols could be Turing-complete programs, and thus capable of processing arbitrary models of reality without you knowing about them. I have no problem accepting the "entire room" as a system understands Chinese.
In contrast, in GP's example, you're learning statistical patterns in the Thai corpus. You'll end up building some mental models of your own just to simplify things[1], but I doubt they'll "carve reality at the joints" - you'll overfit to the patterns that reflect regularities of Thai society living and going about its business. This may be enough to bluff your way through an average conversation (much like ChatGPT does this successfully today), but you'll fail whenever the task requires you to use the kind of computational model your interlocutor uses.
Math and logic - the very tasks ChatGPT fails spectacularly at - are prime examples. Correctly understanding the language requires you to be able to interpret text like "two plus two equals" as a specific instance of "<number> <binary-operator> <number>"[2], and then execute it using learned abstract rules. This kind of factoring is closer to what we mean by understanding: you don't rely on surface-level token patterns, but match against higher-level concepts and models - Turing-complete programs - and factor the tokens accordingly.
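That factoring can be sketched in a few lines: instead of matching surface tokens, map them to an abstract <number> <binary-operator> <number> form and execute it. Everything here (the vocabularies, the function name) is invented for illustration.

```python
# Toy word-to-meaning tables; a real system would learn these.
NUMBERS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4}
OPERATORS = {"plus": lambda a, b: a + b, "minus": lambda a, b: a - b}

def complete(prompt: str) -> str:
    """Complete '<number> <op> <number> equals' via abstract rules,
    not via memorized surface patterns."""
    num_a, op, num_b, _equals = prompt.split()
    result = OPERATORS[op](NUMBERS[num_a], NUMBERS[num_b])
    # Map the abstract result back to a surface token.
    words = {v: k for k, v in NUMBERS.items()}
    return words[result]

print(complete("two plus two equals"))  # four
```

A pure pattern-matcher can only complete sums it has seen; this factored version handles any sum its vocabulary covers, which is the difference being pointed at.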
Then again, Chinese room relies on the Chinese-understanding program to be handed to you by some deity, while GP's example talks about building that program organically. The former is useful philosophically, the latter is something we can and do attempt in practice.
To complicate it further, I imagine the person in GP's example could learn the correct higher-level models given enough data, because at the center of it sits a modern, educated human being, capable of generating complex hypotheses[3]. Large Language Models, to my understanding, are not capable of it. They're not designed for it, and I'm not sure if we know a way to approach the problem correctly[4]. LLMs as a class may be Turing-complete, but any particular instance likely isn't.
In the end, it's all getting into fuzzy and uncertain territory for me, because we're hitting the "how the algorithm feels from inside" problem here[5] - the things I consider important to understanding may just be statistical artifacts. And long before LLMs became a thing, I realized that both my internal monologue and the way I talk (and how others seem to speak) are best described as a Markov chain producing strings of thoughts/words that are then quickly evaluated and either discarded or allowed to grow further.
--
[0] - https://en.wikipedia.org/wiki/Chomsky_hierarchy
[1] - On that note, I have a somewhat strong intuitive belief that learning and compression are fundamentally the same thing.
[2] - I'm simplifying a bit for the sake of example, but then again, generalizing too much won't be helpful, because most people only have a procedural understanding of a few of the most common mathematical objects, such as real numbers and addition, rather than a more theoretical understanding of algebra.
[3] - And, of course, exploit the fact that human languages and human societies are very similar to each other.
[4] - Though taking a code-generating LLM and looping it on itself, in order to iteratively self-improve, sounds like a potential starting point. It's effectively genetic programming, but with a twist that your starting point is a large model that already embeds some implicit understanding of reality, by virtue of being trained on text produced by people.
[5] - https://www.lesswrong.com/posts/yA4gF5KrboK2m2Xu7/how-an-alg...
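The Markov-chain picture of speech described above can be sketched minimally: each word is drawn from the distribution of words that followed the previous word in a corpus, and some separate "critic" would then accept or discard the result. The corpus, seed, and function name are invented for illustration.

```python
import random
from collections import defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# First-order transition table: word -> list of observed followers.
transitions = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev].append(nxt)

def babble(seed: str, length: int, rng: random.Random) -> list:
    """Generate a chain of words by sampling observed followers."""
    words = [seed]
    for _ in range(length):
        followers = transitions.get(words[-1])
        if not followers:
            break  # dead end: the last word never had a follower
        words.append(rng.choice(followers))
    return words

print(" ".join(babble("the", 5, random.Random(0))))
```

Every chain it produces is locally plausible (each adjacent pair occurred in the corpus), which is exactly the "fluent but shallow" quality being discussed.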
> you'll fail whenever the task requires you to use the kind of computational model your interlocutor uses.
I think it's important to distinguish between knowing the language and knowing anything about the stuff being discussed in the language. The top-level comment this is all under mentioned knowing what a bag is or what popcorn is. These don't require computational complexity, but they do require some data other than just text, and a model that can relate multiple kinds of input.
> What this article is not showing (but either irresponsibly or naively suggests) is that the LLM knows what a bag is, what a person is, what popcorn and chocolate are, and can then put itself in the shoes of someone experiencing this situation, and finally communicate its own theory of what is going on in that person's mind. That is just not in evidence.
Knowing something about the patterns of word order in Thai is not the same as knowing about the world being discussed in Thai.
It also does not "know" that a pizza is an object in a world, because none of the words it's working with are attached to any experience or concepts.
Rather than your Thai text example, let's consider H, a friend of my sister. H has been profoundly blind from birth. Not "legally blind" with the world a blur; her eyes actually don't work. Direct lived experience of a summer day is, to her, literally just feeling warmth on her face from the sun; her eyes can't see the visible light.
I've seen purple and H never will, so it seems to me you're arguing that I "know" what purple is and she doesn't, and thus ChatGPT doesn't know what purple is either. But I don't think I agree. I think we're both just experiencing a tiny fraction of reality, that ChatGPT is experiencing an even narrower sliver than either of us, and that it probably wouldn't do us any good to try to quantify it. If I "know what purple is," then so does H, and perhaps ChatGPT or a successor model will too.
It's an ironically apt analogy, because ChatGPT has the linguistic understanding of an entity that is deaf, dumb, blind, and has no working senses of any kind, and instead relies on a golem-like automated mass of statistics with some query processing.
We tend to project intelligence onto linguistic ability, because it's a useful default assumption in our world. (If you've ever tried speaking a foreign language while not being very good at it, you'll know how the opposite feels. Humans assume that not being able to use language is evidence of low intelligence.)
But it's a very subjective and flawed assessment. Embodied experience is far more necessary for sentience than we assume, and apparent linguistic performance is far less.
Much like when humans started experimenting with flight, we tried to make flapping things like birds, but in the end it turns out spinning blades give us capabilities above and beyond bodies that flap.
Back to the embodiment problem. For us as humans we have limits like only having one body. It has a great number of sensors but they are still very limited in relation to what reality has to offer, hence we extend our senses with technology. And with that there is no reason machine intelligence embodiment has to look anything like ours. Machine intelligence could have trillions of sensors spread across the planet as an example.
My sister isn't blind. H isn't my sister, she's a friend of my sister as I wrote.
Do you have concrete justification for your insistence that "embodied experience is far more necessary" ?
I do agree that grounding is needed. All our language expresses or abstracts concepts related to how we perceive and interact with reality in continuous space and time. This perception and interaction is a huge correlating factor that our ML models don't have access to - and we're expecting them to somehow tease it out from a massive dump of weakly related snapshots of recycled high-level human artifacts, be they textual or visual. No surprise the models would rather latch onto any kind of statistical regularity in the data and get stuck in a local minimum.
Now, I don't believe the solution is actual embodiment - that would be constraining the model too hard. But I do think the model needs to be exposed to the concepts of time and causality - which means it needs to be able to interact with the thing it's learning about, and feed the results back into itself, accumulating them over time.
As long as our minds pop out appropriate thoughts for a given context, we don't even think about the magic machinery behind the scenes that did that.
When queried about our thinking we are mostly creating a plausible story, not actually examining our own thinking.
Also, blind people can talk sensibly about many visual phenomena, having learned about them through language.
I think the new LLMs are giving us all so many wows because "understanding" is the only kind of compression that actually works at the scale of the training data.
I.e. representations are being created that reflect the actual functional, as well as associative or correlative, relations between concepts.
But blind people can talk about color intelligently too, if not as completely as a sighted person. Despite not experiencing color qualia.
In other words, an LLM that is tied to a GAN that generates images produces a system that can both describe to you verbally what a cat is and show you a picture of a cat. Does it, then, know what "a cat" is?
Edit: Furthermore, suppose you then tie this AI to a CV model with a camera: you can point it at a cat and it will tell you that it is, indeed, a cat, and it will also be able to produce a verbal description of a cat, show you an abstract picture of a cat, or pick cats out of a random set of images. Does this whole system know what "a cat" is?
If you then make a robot with a camera and hands, attach to the system a more complex CV model that can see in 3D, ask the LLM to produce a set of code instructions that can be parametrized to produce a motion that would pet the cat, and input those instructions into the robot to make it pet a specific cat with a specific 3D point cloud (I guess that's currently difficult but solvable), and the system then indeed pets the cat, would it then know what "a cat" is?
The underlying LLM is still the same in all these scenarios. Where is the boundary?
In other words, the LLM wouldn't be the equivalent of the human brain. Instead, it would just be equivalent to that part of the human brain that processes language.
No, it's not the same LLM; you'd have to change the LLM in all of those cases. How does it receive input from the GAN? The typical LLM is constructed to literally receive a sequence of encoded tokens. There are vision transformers, and they do chunk images into tokens, and there are multimodal transformers, but none of these are fairly described as an LLM, and they're structurally different from something like ChatGPT. And after the structural changes, it would need to be trained on some new data that associates text sequences and image sequences, and after being optimized in that context you have a _different model_.
Does being able to identify images of cats mean the model knows what a cat is? No, and we could have said that a decade ago when deep learning for image classification was making its early first advances. Does being able to describe a cat from video mean you know what the cat is? Probably not, but maybe we're getting closer. Does knowing how to pet a cat mean you know what a cat is? Perhaps not if you need to be instructed to try to pet the cat.
But suppose 10 years from now, I have a domestic robot that has a vision system, and a motor control system, and an ability to plan actions and interact with a rich environment. I would say the following would be strong evidence of knowing what a cat is:
- it can not only identify or locate the cat, but can also label parts of the cat, despite the cat having an inconsistent shape. It can consistently pick up the cat in a way that is sensitive to and considerate of the cat's anatomy (e.g. not by the head or by one paw)
- it can entertain the cat, e.g. with a laser pointer, and can infer whether the cat is engaged, playful, stressed, angry etc
- it avoids placing fragile objects near high edges, because it can anticipate that the cat is likely to knock them down, even if the cat is not currently nearby
- it can anticipate the cat's behavior and adjust plans around it; e.g. avoid vacuuming the sunny spot by the window in the afternoon when the cat is likely to be napping there
- it can anticipate the cat's reactions to stimuli, such as loud noises, a can of food opening, etc, and can incorporate these considerations into plans
Note, _none_ of the above have anything to do with language. If I add to the robot a bunch of NLP systems to hear and understand commands or describe its actions or perceptions, it may now know that a cat is called "cat", and how to talk about a cat, but these are distinct from knowing what a cat is.
Similarly,
- a human with some serious aphasia may be unable to describe the cat, but they can clearly still know what a cat is
- a dog can know what a cat is, in many important ways, despite having no language abilities
Note that this isn't just an exotic thought experiment. People like this already exist; the condition is known as "Wernicke's aphasia". People displaying this condition can speak normally. They can't understand things; they are missing a normal mental mapping from words to meanings.
Not really? They can speak in grammatically correct sentences, with connected speech, but what they say can be nonsense. I wouldn't call that normal. I think LLMs show that, solely with access to text, it's possible to produce a good enough model that what you produce is not only not nonsense, but so good that academic psychologists suggest it may have a theory of mind.
> However, often what they say doesn’t make a lot of sense or they pepper their sentences with non-existent or irrelevant words.
https://www.aphasia.org/aphasia-resources/wernickes-aphasia/
> is there anything that would stop LLMs from being able to do the same thing?
If you built an AI system which could hear/see/touch/move etc, and it learned language and vision and behaviors together, such that it knows that a ball is round, can be thrown or rolled, is often used at playtime, etc, then maybe it could understand rather than just produce language. I don't know that we would still call it an LLM, because it could likely do many other things too.
The point, for this thread, is not whether or not Socrates was correct.
Rather, it’s a warning that we must not confidently assume we are anything like a machine.
We may have souls, we may be eternal, there may be something utterly immaterial at the heart of us.
As we strive to understand the inner-workings of machines that appear, at times, to be human-like, we ought not succumb to the temptation to think of ourselves as machine-like merely in order to convince ourselves (incorrectly) that we understand what’s going on.
With that said, there is quite literally zero evidence for the existence of a soul, despite it being posited for thousands of years, and increasing evidence that consciousness is simply a product of a sufficiently connected system. I'll draw an analogy to temperature, which isn't "created", but is a simple consequence of two points in space having different energy levels. I'm sure there's a better analogy that could be made, but I think you get the idea.