I bet a lot of these experiments would already be solvable by putting the LLM in a simple loop with some helper prompts that make it restructure and validate its answers, form theories, and explore multiple lines of thought.
If an LLM were able to do that in a single prompt, without a loop (so the LLM always answers in a predictable amount of time), then it would mean its entire reasoning structure is repeated horizontally through the layers of its architecture. That would be both limiting (i.e. it would limit the depth of the reasoning to the width of the network) and very expensive to train.
But if a human is allowed time and internal reasoning iterations, so should the LLM when determining if it has deep insight. Right now we're simply observing input -> output of LLMs, the equivalent of snap answers from a human. But nothing says it couldn't instead be an input -> extensive internal dialogue, maybe even between multiple expert models for seconds, minutes or hours, that are not at all visible to the prompter -> final insightful answer. Maybe future LLMs will say, "let me get back to you on that".
From a computer science point of view: a single prompt/response cycle from an LLM is equivalent to a pure function; the answer is a function of the prompt and the model weights, and is fundamentally reducible to solving a big math equation (in which each model parameter is a term).
It seems almost self-evident that "reasoning" worthy of the name would involve some sort of iterative/recursive search process, invoking the model and storing/reflecting/improving on answers methodically.
There's been a lot of movement in this direction with tree-of-thought/chain-of-thought/graph-of-thought prompting, and I would bet that if/when we get AGI, it's a result of getting the right recursive prompting pattern + retrieval patterns + ensemble models figured out, not just making ever-more-powerful transformer models (though that would certainly play a role too).
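For concreteness, here's a minimal sketch of the kind of draft/critique/revise loop described above; ask_llm is a hypothetical placeholder for whatever completion API you have, not a real library call:

    def ask_llm(prompt: str) -> str:
        # Stand-in for a real LLM client; plug in your own completion call here.
        raise NotImplementedError

    def answer_with_reflection(question: str, max_rounds: int = 3) -> str:
        # Draft an answer, then repeatedly ask the model to critique and revise it.
        answer = ask_llm(f"Answer the following question:\n{question}")
        for _ in range(max_rounds):
            critique = ask_llm(
                "List concrete flaws or gaps in this answer, or reply DONE if there are none:\n"
                f"Question: {question}\nAnswer: {answer}"
            )
            if critique.strip().upper().startswith("DONE"):
                break
            answer = ask_llm(
                "Rewrite the answer, fixing these issues:\n"
                f"Question: {question}\nAnswer: {answer}\nIssues: {critique}"
            )
        return answer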
The LLM isn't the whole brain. Just the area responsible for language and cultural memory.
spoilers warning:
Is that basically the plot to westworld?
The great part is, with clear enough directions it also knows how to evaluate whether it's done or not.
No, it's the equivalent of putting a gun to someone's head and asking them "what are my intentions?" Which is readily available to any being with a theory of mind.
Because, obviously, training data probably includes a decent amount of motivation breakdowns as a function of coercion.
It doesn’t know why, but it knows what to say.
Put a gun to a person's head.
Ask them to do a division.
Then scream at them, "HOW DID YOU DO THAT, TELL ME NOW, OR YOU'RE TOAST".
Even most humans would splutter and not be able to answer.
Are they faking it? Will they murder me? Are they just trying to scare me?
I don't understand the point you're trying to make.
Or am I misunderstanding something about that technique?
Of course, yes, we do know one thing that's missing in LLMs, which is "loop and helpers" like you describe. Which I'm sure many people are currently hacking at - one way being for the LLM to talk to itself.
But as for "a theory of mind", if enough writings served as input, then LLMs do have plenty of that.
Another question is whether LLMs are raised to behave like humans (which might be where they most NEED some theory of mind). Of course not. The ones we know most about are only question answerers. The theory of mind they might have (that is not negated by the lack of loop and internal deliberation) may be overwhelmed by the pre- and post-processing: "no sex, no murder plots, talk to the human like they are 5, bla bla bla". And yet you can ask things like "Tell it like you are speaking to 5 year olds who want to have a fun time". Some theory of mind makes it through.
While this is a simple example, the concept extrapolates to more complex issues as we develop cognitively and emotionally. Not only is theory of mind about recognizing that your thoughts and feelings are separate and distinct, but also being able to project your understanding into others to predict how they might be thinking and feeling in their own separate circumstance. This is the fundamental basis of empathy, or being able to predict and understand how others around you may be feeling given their unique current circumstances even if they are different from your own.
LLMs have not demonstrated any of the above understanding. Again, I'm sure they could generate a definition for it, and on any given prompt they could probably even generate some generic-ish text that could pose as empathy, but so can a horoscope. But there is not the deep, continuous comprehension of the user as a separate thinking and feeling agent that is fundamental to the idea of theory of mind.
These responses are logically the same as "No You".
https://www.sciencedirect.com/science/article/abs/pii/S00100...
I get frustrated often when people argue "well, it isn't really intelligent" and then give examples that are clearly dependent on our brain's chemical state and our bodies' existence in-the-physical-world.
I get the feeling that when/if we are all enslaved by a super-intelligent AI whose motives we do not understand, we will still argue that it is not intelligent because it doesn't get hungry and it can't prove to us that it has qualia.
This paper argues that gpts are bad at understanding human risk/reward functions, which seems like a much more explicit way to talk about this, and also casts it in a way that could help reframe the debate about how human evolution and our physical beings might be significantly responsible for the structure of our rational minds.
It doesn't appear until the early 20th century, in the shadow of compulsory education and the challenges it presented, first as a technical label for attempts to sort students -- and later soldiers -- into the tracks in which they're most likely to succeed, and then being haphazardly asserted (but not scientifically evidenced) as some general measure of mental aptitude.
At that point it shifts from something qualitative (which mental tasks might someone be good at) to something quantitative (how much more might one person excel at all mental tasks than another), and the burgeoning field of modern American psychology goes "Aha! A quantitative measure! Here's our meal ticket to being recognized as a science instead of those quacks from Vienna", with far too much at stake to question either the many assumptions at play or the inconsistent history of usage.
Momentum takes hold and the public takes the word into its everyday vernacular, even while it's still not a clear and sound concept in its technical domain. [Most of this history is more academically covered in Danziger's 1987 "Naming the Mind", which is excellent and critical foundational reading for contextualizing recent hot discussions in AI]
The way you're using it when you worry about "super-intelligence" is in the sense of intelligence being some universal, unbounded, quantitative independent variable along the lines of "the more intelligent something is, the more cunningly it can pursue some rationalized goal" -- some master strategist.
That's fine, and you're not alone in that, but there's not really any sound scientific groundwork to establish that there exists some quality of the world that scales like that. Your fear, and what you try to distinguish conceptually from what the paper addresses, is an inductive leap made from highly unstable ground. It's in the same invented, purely abstract idea-space as "omnipotence" or "omniscience", where one takes a practical idea like "power to influence" or "ability to know a fact" and inductively draws a line from these practical senses towards some abstract infinite/incomprehensible version of that thing. But that inductive leap is a Platonic logician's parlor trick and ends up raising all kinds of abstract paradoxes, as well as countless physical impracticalities about how such a thing could exist.
So a lot of people (academic and lay) just aren't with you in taking that framing of intelligence very seriously. For many, a "super-intelligent" piece of software whose "motives" we don't understand is just a program that produces incorrect outputs and ought to be debugged or retired, and the more interesting questions around machine "intelligence" are practical ones like "what tasks are these programs well-suited for". Here, the authors point out that the current batch of programs are not good at tasks that benefit from a theory of mind.
Knowing the answer to that kind of question reaches back to the earliest and least disputable sense of the word, where we saw that some new students and soldiers excelled at certain tasks and struggled with others, and we wanted to understand how best to educate/assign them. And likewise, as we look at these tools, the pressing question for engineers and businesses is "what are they good for and what are they not good for", rather than the fantastical "what if we make a broken program and it wants to kill everyone and we don't notice and forget to shut it off".
It wouldn't have to want to kill everyone. As long as it doesn't want to not kill everyone, the side effects of it getting what it wants could be catastrophic.
> and we don't notice
How well do we understand what's going on inside ChatGPT? How well will we understand the next?
> and forget to shut it off
Earlier I would have argued that sufficiently advanced AI could prevent itself from being shut off via Things You Didn't Expect, and would instrumentally want to preserve its existence. But these days, people are giving ChatGPT not just internet access but even actively handing it control over various processes. At this rate, the first superhuman AI will face not an impermeable box but a million conveniently labeled levers!
I'm not sure what you mean here, since the word dates back to the late 14th century with roughly the same meaning as now. Perhaps you're thinking of "intelligence quotient"?
I appreciate the highlighting of the term intelligence being ill-defined. Moreover, it's certainly true that "AI safety analysts" take intelligence as a sort of magic-wand term, and this seems to drive their arguments.
All that said, since both computers and human brains are material artifacts, it doesn't seem impossible to create a device that combines their properties. It seems plausible that such a thing could have a variety of dangers.
> For many, a "super-intelligent" piece of software whose "motives" we don't understand is just a program that produces incorrect outputs and ought to be debugged or retired, and the more interesting questions around machine "intelligence" are practical ones like "what tasks are these programs well-suited for".
We saw early Bing Chat behave, not in ways we couldn't understand but like a deranged and vengeful human. Certainly, it was merely simulating human behavior but if today's methods produce artifacts that unselectively amplify human behaviors, it's not hard to imagine problems appearing.
We can hope that there's a fundamental difference between programs that simulate human language and programs able to plan and carry out long term goals (and carrying out long term goals is something people do so there's no good reason some kind of program couldn't do that).
I think you're right that the particular weirdness of the "doomers" makes some other portion of the population dismiss concerns. But that isn't an argument that the doom isn't possible - it should be an argument for clarifying how we talk about computation and human capacities (see, I don't have to say "intelligence" unless I want to).
>Here, the authors point out that the current batch of programs are not good at tasks that benefit from a theory of mind.
Not good at tasks that benefit from a theory of mind extracted from visual data.
I'm just saying that I don't think there's any point on that line where we will be comfortable admitting that the machine is "intelligent" or "conscious" or "AGI," or whatever, and that I appreciate attempts to quantify (or at least qualify) what we MEAN when we say that, rather than just goalpost-moving.
We (mostly) don't want unaligned A(G|S)I. The outcomes of that could be existential.
A big part of the problem is that "is" has a wide variety of inconsistent meanings, and that this fact is sub-perceptual, and that it is culturally very inappropriate to comment on aspects of our culture like this, preventing knowledge of the problem from spreading.
/u/Swatcoder makes essentially the same point but in much more detail, though regarding less important words.
It's not just you. It hit me almost a year ago, when I realized my then 3.5yo daughter has a noticeable context window of about 30 seconds - whenever she went on her random rant/story, anything she didn't repeat within 30 seconds would permanently fall out of the story and never be mentioned again.
It also made me realize why small kids talk so repetitively - what they don't repeat they soon forget, and what they feel like repeating remains, so over the course of couple minutes, their story kind of knots itself in a loop, being mostly made of the thoughts they feel compelled to carry forward.
I would claim that most people use intuition/assumptions rather than internal chain-of-thought, when communicating, meaning they will present that simplified concept without second thought, leading to the same behavior as the toddler. It's actually trivial to find someone that doesn't use assumptions, because they take a moment to respond, using an internal chain-of-thought type consideration to give a careful answer. I would even claim that a fast response is seen as more valuable than a slow one, with a moment of silence for a response being an indication of incompetence. I know I've seen it, where some expert takes a moment to consider/compress, and people get frustrated/second guess them.
The problem is consciousness is a vocabulary word that establishes a hard boundary where such a boundary doesn't exist. The language makes you think either something is conscious or it is not, when the reality is that these two concepts are actually extreme endpoints on a gradient.
The vocabulary makes the concept seem binary and makes it seem more profound than it actually is.
Thus we have no problem identifying things at the extreme. A rock is not conscious. That's obvious. A human IS conscious, that's also obvious. But only because these two objects are defined at the extremes of this gradient.
For something fuzzy like chatGPT, we get confused. We think the problem is profound, but in actuality it's just poorly defined vocabulary. The word consciousness, again, assumes the world is binary that something is either/or, but, again, the reality is a gradient.
When we have debates about whether something is "conscious" or not we are just arguing about where the line of demarcation is drawn along the gradient. Does it need a body to be conscious? Does it need to be able to do math? Where you draw this line is just a definition of vocabulary. So arguments about whether LLMs are conscious are arguments about vocabulary.
We as humans are biased and we blindly allow the vocabulary to mold our thinking. Is chatGPT conscious? It's a loaded question based on a world view manipulated by the vocabulary. It doesn't even matter. That boundary is fuzzy, and any vocab attempting to describe this gradient is just arbitrary.
But hear me out. chatGPT and DALL-E are NOT hype. Why? Because along that gradient they're leaps and bounds further than anything we had even a decade ago. It's the closest we've ever been to the extreme endpoint. Whichever side you are on in the great debate, both sides can very much agree with this logic.
But it's not obvious at all. It may possess consciousness in a way we can't relate to or communicate with.
This is the whole problem with consciousness and has been discussed by philosophers for centuries. We each appear to be conscious but can't be certain anything else is or isn't.
Lots of software engineers have spent their lives reading sci-fi that features AGI, and they're excited by/lost in that fantasy.
It's interesting to see that in people who often view themselves as hyper-rational.
Because you have to solve 10,000 different problems. And a huge number of those problems are going to have significant overlap, but sharing lessons between them is going to be difficult unless you have a generalized algorithm.
Hence AGI is the trillion dollar question.
Also, there is a general hubris in all this of only looking at the new and shiny. I remember that pizza robot (some multi-axis hand thing) that cost who knows what in building and research, when the Costco pizza "robot" is pretty darn good but doesn't sell as "futuristic/cool" because it's a spigot on a servo.
The real impacts will come when they are properly integrated into the current computational fabric, which everyone is racing to do as we write this.
Hinton is one of these individuals, and with no definition of what intelligence is, it is an understandably dogmatic position.
This whole problem of not being able to define what intelligence is pretty much allows us all to pick and choose.
In my mind BPP is the complexity class solvable by ANNs and it is a safe and educated guess that most likely BPP=P.
BPP being one of the largest practical complexity classes makes work in this area valuable.
But for many reasons that I won't enumerate again, AGI simply isn't possible, and believing in it requires a dogmatic position for people who have even a basic understanding of how these systems work and the limits implied by the work of Gödel etc...
But many of the top scientists in history have been believers of numerology etc...
Associating math with LLMs is a useful tool to avoid wasted effort by those who don't believe AGI is close, but it won't convince those who are true believers.
LLMs are very useful for searching very high-dimensional spaces, and for problems that are ergodic with the Markov property they can find real answers.
But most of what is popular in the press will almost certainly be a dead end for generalized use, as the systems are not extremely error tolerant.
Unfortunately it may take another AI winter to break the hype train but I hope not.
IMHO it will have a huge impact but overconfident claims will cause real pain and misapplication for the foreseeable future.
You might say, well that's not AGI, AGI must also do such and such. Well, we can get arbitrarily close to that definition as well via RLHF.
Another objection might be: well, if that's the definition of AGI, that seems really underwhelming compared to the hype train. This says nothing about autonomy, sentience, free will -- exactly. Those concepts can or should be orthogonal to doing productive work, IMHO.
So, there it is. We can now make a reward model for folding socks, and use gradient descent with RL to do the motion planning.
Maybe that's AGI and maybe it's not, but I'd really love it if we had a golden period between now and total enshittification that involved laundry folding robots.
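As a toy illustration of that pattern (a reward model scoring outcomes, and gradient ascent pushing a policy toward what it scores well), here is a tiny, made-up REINFORCE example -- nothing to do with actual sock-folding robotics:

    import numpy as np

    rng = np.random.default_rng(0)

    def reward_model(action: int) -> float:
        # Stand-in for a learned reward model; here it simply prefers action 2.
        return 1.0 if action == 2 else 0.0

    logits = np.zeros(4)  # policy over 4 discrete "motions"
    lr = 0.1

    for _ in range(2000):
        probs = np.exp(logits) / np.exp(logits).sum()
        action = rng.choice(4, p=probs)
        r = reward_model(action)
        grad = -probs            # REINFORCE: grad of log prob = onehot(action) - probs
        grad[action] += 1.0
        logits += lr * r * grad  # gradient ascent on expected reward

    print(probs.round(3))  # probability mass should concentrate on action 2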
Now it is here and it's like "No, what we really meant is it has to be the next Einstein".
People are forgetting how stupid people are.
GPT is already better than the average human.
Most people can't do what we claim GPT must be capable of to qualify as AGI.
The only logical conclusion is that many people are also not conscious and don't qualify as being able to reason.
Let's say I'm deep in a coding problem. A co-worker comes by and says "How did your team do in the game yesterday?". I say, "Um, uh... sorry, my head's not there right now." It takes us time to swap between mental "spaces".
So, if I have an AGI (defined as having a trained model for almost everything, even if that turns out to be a large number of different models), if it has to load the right model before it can reason on that topic, then that's pretty human-like. (As long as it can figure out which model to load...)
The one thing missing is that (at least some) humans can figure out linkages between different mental "spaces", to turn them into a more coherent holistic mental space, even if they don't have (all of) each space at front-of-mind at any moment. I'm not sure if this flavor of an AGI could do that - could see the connections between different models.
An AGI should be able to solve any creative problem a human could, with drive and knowing purpose and coherent vision. The LLMs are still narrowly focused and require human supervision.
We might well get there with chained AIs automatically training new reward models for each new problem, or by some other paradigm, but I don't feel like we're past the threshold yet.
So it seems like maybe a better title would be "LLMs don't have as advanced a theory of mind as a human does... for now..."
LLM names a specific product, aimed at solving a specific problem.
If you've found some, please let everyone know.
I disagree with the topic sentence.
The goal should not be to "build machines that think like people", but to build machines that think, period. The way humans think is unlikely to be the optimal way to go about thinking anyways.
Instead of talking about thinking, we should be talking about function. Less philosophy and more reality. Can the system reason itself through various representative challenges as well as or better than human? If yes, it doesn't much matter how it does it. In fact, it's probably for the best if we can create AI that thinks completely different than humans, has no consciousness or self awareness, but still can do what humans can do and more.
Game AIs are functionally much better than humans but no one believes they can think, right?
Oh, but if you are arguing for AI from a specialized tool standpoint and not a general intelligence standpoint, if you are talking about "weak" AI rather than "strong" AI, then I'm right there with you. :-)
Many scientists outside the AI field have long shared an interest in the objective of how to "think like people" using software. Far fewer care if the AI is inexplicable (or if it can't be dissected into constituent components, thereby enabling us to explore the mind's constraints and dependencies among its cognitive processes).
The paperclip optimizer is a great parable here. If you build your intelligence to build as many paperclips as cheaply as possible don't be surprised when said intelligence disassembles you and the rest of the universe to do so.
So yea, HOW starts mattering a whole lot when you want to ensure it understands that it shouldn't do some particular things.
To be clear, I think this is in fact a correct assessment of the architecture of intelligence. You can suspend thought and still function throughout your day in all ways. Discursive thought is entirely unnecessary, but it is often helpful for planning.
My observation of LLMs in such a construction of intelligence is they are entirely the thinking mind - verbal, articulate, but unmoored. There is no, for lack of a better word, “soul,” or that internal awareness that underpins that discursive thinking mind. And because that underlying awareness is non articulate and not directly observable by our thinking and feeling mind, we really don’t understand it or have a science about it. To that end, it’s really hard to pin specifically what is missing in LLMs because we don’t really understand ourselves beyond our observable thinking and emotive minds.
I look at what we are doing with LLMs and adjacent technologies and I wonder if this is sufficient, and building an AGI is perhaps not nearly as useful as we might think, if what we mean is build an awareness. Power tools of the thinking mind are amazingly powerful. Agency and awareness - to what end?
And once we do build an awareness, can we continue to consider it a tool?
While you're adding a bunch of eastern philosophy to it, we need to take a step back from 'human' intelligence and go to animal and plant intelligence to get a better idea of the massive variation in what covers thought. In animal/insects we can see that thinking is not some binary function of on or off. It is an immense range of different electrical and chemical processes that involve everything from the brain and the nerves along with chemical signaling from cells. In things like plants and molds 'thinking' doesn't even involve nerves, it's a chemical process.
A good example of this at the human level is a reflex. Your hand didn't go back to your brain to ask for instructions on how to get away from the fire. That's encoded in the meat and nerves of your arm by systems that are much older than higher intelligence. All the systems for breath, drink, eat, procreate were in place long before high level intelligence existed. Intelligence just happens to be a new floor stacked hastily on top of these legacy systems that happened to be beneficial enough it didn't go extinct.
Awareness is another one of those very deep rabbit hole questions. There are 'intelligent' animals without self awareness, but with awareness of the world around them. And they obviously have agency. Of course this is where the AI existentialists come in and say wrapping up agency, awareness, and superintelligence may not work out for humans as well as we expect.
Is this actually true? I thought it just involved a different part of the brain. Is there actually no brain involvement? Sure it does not need your awareness or decision making, but no brain? I find that hard to believe.
If that's how it works, then the "soul" is more like an emergent phenomenon created by the interplay between the various layers of conscious thought and the base layer of nothingness when it's all turned off. That architecture wouldn't necessarily be so difficult to replicate in AI systems.
This is simple to experience for yourself, since it would mean we stop being aware when listening so intently that thoughts stop. Obviously we don't cease to be aware at such times.
You've also misunderstood what is meant by nothingness ("no thingness").
It's not. They don't realize it, they're merely referring to stopping your internal monologue. There are dozens of other mental processes going on in any given waking moment. Even actual top shelf cognition is going on, it just occurs in a "language of thought".
But to my understanding the idea of nothingness being some objective in Buddhism isn’t the case - but it’s often described as such because that state of pure awareness without encumbering thought and attachment in many ways to an unpracticed person feels like nothingness. After all, the awareness is silent, even if it is where all thought and feeling spring from.
Finally, awareness isn’t that moment you snap back to thought. You’re always aware. We just tend to be primarily aware of our thoughts and emotions. We walk around in a haze of the past and future and fiction as the world ticks by around us, and we tend to live in what isn’t rather than what is. You don’t disappear in the sense that you cease to be as an individual mind, you are always yourself - that’s a tautology. What you lose is the sense of some identity that’s separate from what you ARE in this very moment. You aren’t a programmer, you aren’t a Democrat, you aren’t a XYZ. You are what you are, and what that is changes constantly, so can’t be singularly defined or held onto as some consistent thing over time with labels and structure. You just simply are.
Meditation techniques that focus on breath or the body are an attempt to make you do the breathing/sensing consciously. If you film yourself and later look at what you did, you'll notice you aren't breathing well when you're breathing consciously, so you're probably depriving yourself of oxygen, lowering blood concentration in certain brain regions and you hope it will be the brain region associated with conceptualizing, language etc.
You can do the same with sleep. You can try to consciously fall asleep, and just like breathing, you will have a hard time because there's a reason why falling asleep is not conscious (or in other words it does not go through the regions of the brain that conceptualize). You can experience the balance center shutting down (feels like falling or turning) and you can go even deeper and feel the fear of the "ego" dying (temporarily). What remains is definitely much different than waking or dreaming state. But it is still not that "awareness/nothingness".
If you assume that "the eyes are the window to the soul", you notice some interesting properties.
1. It is far more observable from the outside (eyes open/lidded/closed, emotion read in eyes)
2. It affects behavior in a diffuse way
3. It pays attention but does not dictate
My pet theory about human consciousness is that consciousness is simply recursive theory of mind. Theory of mind [1] is our ability to simulate and reason about the mental states of others. It's how we predict what people are thinking and how they will react to our actions, which is critical for choosing how to act in a social environment.
But when you're thinking about what's in someone's head, one of the things might be them thinking about you. So now you're imagining your own mind from the perspective of another mind. I believe that's entirely what our sense of consciousness is. It's our social reasoning applied to ourselves.
If my pet theory is correct, it implies that the level of consciousness of any species would directly correlate to how social the species is. Solitary animals with little need for theory of mind would have no self awareness in the way that we experience it. They'd live in a zen-like perpetual auto-pilot where they do but couldn't explain why they do what they do... because they will never explain it to anyone anyway.
And people say LLM output is nonsense.
I'm blind with glass eyeballs. Does this mean my soul is easier to access than yours? Or is it harder because there's something specific about the eyeball that makes it the window?
Decision making requires imagination or the ability to envision alternative future states that may result from various choices.
Imagination is the start of abstract thinking. Consciousness results from the individual thinking abstractly about itself and how it interacts with the world.
I believe this is more or less the definition of human mental illness. I have to say that while I know it's really not possible, I wish people would stop pulling on these threads. I got into this line of work because I thought video games were cool, not because I wanted to philosophize about theories of mind and what intelligence is. I really don't like thinking about whether I'm just some sort of automaton made out of meat rather than metal and silicon.
Maybe that's their goal.
But for many users of AI, the goal is to have easy and affordable access to a machine that, for some input (perhaps in a tightly constrained domain), gives us the output that we would expect from a high-functioning human being.
When I use ChatGPT as a coding helper, I really don't care about its "theory of mind." And its insights are already as deep (actually more deep) as I get from most humans I ask for help. Real humans, not Don Knuth, who is unavailable to help me.
IIRC one of the prompting techniques developed this year was to ask the model who are some world class experts in a field and then have it write as if it was that group collaborating on the topic.
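Something like this two-step sketch is roughly what that technique amounts to; ask_llm is just a hypothetical placeholder for whatever chat-completion call you use:

    def ask_llm(prompt: str) -> str:
        # Stand-in for a real LLM client.
        raise NotImplementedError

    def expert_panel_answer(topic: str, question: str) -> str:
        # Step 1: have the model name the experts; step 2: have it write as that panel.
        experts = ask_llm(f"Name three world-class experts on {topic}, one per line.")
        return ask_llm(
            "Write an answer to the question below as if the following experts "
            f"were collaborating on it:\n{experts}\n\nQuestion: {question}"
        )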
This was my thought as well. But then I figured if I can't get someone to give me thoughtful feedback, I might have bigger problems to solve.
[1] normally I find HN discussions about what if chatGPT is human or "humans are just autocompletes" to be highschool-level scifi and cringe respectively
And by "us" I mean "those of us who choose to use ChatGPT," and not that I was forcing you to use ChatGPT.
It's true, I don't morally object to asking ChatGPT to "do my labour." It raises no red flag for me. (Okay, there's the IP red-flag about how ChatGPT was trained, but I don't think that's what you mean.)
Maybe and maybe not. Being sentient or even sapient doesn’t mean it has human emotions, feelings or motivations. You have to be very careful when ascribing naturally evolved emotions and motivations to something that is not a naturally evolved intellect.
LLMs are basically a validation of Searle's Chinese room. What they've proven is that you can build functioning systems that perform intelligent tasks purely at the level of syntax. But there is no (or very little) understanding of semantics. If I ask a person how to end the world, whether I ask in French or English or base64, or perform a 50-word incantation beforehand, likely does not matter (unless of course the human is also just parroting an answer).
The Chinese room argument is bad in that it hides an assumption of mind/body dualism. If you believe that humans have "souls" and other things do not, then you have a qualitative difference between a human or a machine. On the other hand, if you are a materialist then you are faced with the problem that humans don't have much understanding of semantics either. We're all chemical processes and it's hard for those to get much into semantics.
But then, the difference between LLMs and humans becomes quantitative, sort of, and since I cannot say that LLMs and humans are qualitatively different, the only argument I can find is that in my experience, LLMs have never responded in a way that leads me to believe that they are anything other than a statistical model of language. Humans, on the other hand, are not a statistical model of language.
It's because our bandwidth and monkey brains are so slow that we're forced to operate at the level of semantics. We can't just make inferences from almost infinite amounts of data, the same way we can't play chess like Stockfish or do math like a calculator. The dualism is precisely in the opposite view, that computation is somehow "substrate independent". Searle argues we can have AI that has understanding the way we do, just that it's going to look more like an organic brain as a result.
The important insight from LLMs is that they're not like us at all but that doesn't make them less effective or intelligent. We do have plenty of understanding, we need to because we rely on a particular kind of reasoning, but artificial systems don't need to converge on that.
Human insight is really easy to break, confidence men wouldn't really be a thing if it were hard to break. Simply putting a statement like "I love you" in front of a statement commonly overrides our intellect. Or offering a chocolate bar in trade of our passwords. If you want a human to tell you how to end the world, you'd just convince them to be your friend first.
That's not how LLMs work though, and I'm increasingly convinced that "syntax" and "semantics" are turning into annoyingly useless ideas people forget are descriptive, in the same way grammar books and dictionaries are descriptive.
My model of LLMs is that, in training, they're positioning inputs in an absurdly high-dimensional[0] latent space. That space is big enough to encode anything you'd consider "syntax" and "semantics" on some subset of dimensions. As a result, the model sidesteps the issue of "source of meaning" - there is no meaning but that formed through associations (proximity). This is pretty much how we do it, too - when you think of "chair", there is no token for platonic ideal of a chair in your mind. The word "chair" has meanings and connotations defined via other words, which themselves are defined via other words, ad infinitum, with the only grounding being associations to sensory inputs.
--
[0] - On the order of 100 000 dimensions for GPT-4, perhaps more now for GPT-4V / GPT-4-Turbo.
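A toy way to picture that "meaning as proximity" point: words whose vectors sit near each other in the space act as related. The 3-D vectors below are invented for illustration, not taken from any real model:

    import numpy as np

    emb = {
        "chair": np.array([0.9, 0.1, 0.0]),
        "stool": np.array([0.8, 0.2, 0.1]),
        "galaxy": np.array([0.0, 0.1, 0.9]),
    }

    def cosine(a, b):
        # Cosine similarity: near 1.0 means same direction, near 0.0 means unrelated.
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    print(cosine(emb["chair"], emb["stool"]))   # high: close together in the space
    print(cosine(emb["chair"], emb["galaxy"]))  # low: far apart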
Not even slightly. See Cantor's diagonal argument.
See also Plato's cave for syntax vs semantics.
As for the rest of it, the LLM is basically "raw compute". You need a self-referential loop and long-term memories for it to even have the notion of self. But looking at it at that level and discounting it as "incapable of thinking" is missing the point - it's the larger system of which LLM is one part, albeit a key one (and which we're still trying to figure out how to build) that might actually be conscious etc.
Basically, I invented a board game and played against ChatGPT to see what happened. It was not able to make a single move, even having been provided all the possible starting moves in the prompt as part of the rules.
Not that I had a lot of hope about it, but it was definitely way worse than I expected.
If someone wants to take a look at it:
https://joseprupi.github.io/misc/2023/06/08/chat_gpt_board_g...
Here's my attempt at a similar conversation -- it seems GPT-4 is able to visualise the board and at least make a valid first move.
https://chat.openai.com/share/98427e21-678c-4290-aa8f-da8e93...
The model was whatever was up at that time, so it probably was 3.5 if you say so.
Also in this game if I don't move the queen I force a draw, right?
I don't know, draw your own conclusions; I tried what I tried with the results I got. And the reason I created a Monte Carlo engine to play the game was specifically because of this: I expected ChatGPT to be able to make moves but not actually be good at the game. You can try yourself, the code is available.
> Also in this game if I don't move the queen I force a draw, right?
I don't know, as there is no clock, but I assume it is mandatory to move. What happens in a chess game with no clock if someone does not want to move? The same applies here.
I've bought 'new' board games for kids.
Then, I have been unable to play because the instructions were pretty bad.
Humans also need to 'learn'. Need a few play-throughs.
No human is going out 'in a vacuum' with no experience, buying Risk and, from scratch, reading the instructions and playing a perfect, game-winning strategy.
> If it is not memorizing, how do you think is doing it? (me)
> by trying to learn the general rules that explain the dataset and minimize its loss. That's what machine learning is about; it's not called machine memorizing.
Consider a typical LLM token vector used to train and interact with an LLM.
Now imagine that other aspects of being human (sensory input, emotional input, physical body sensation, gut feelings, etc.) could be added as metadata to the token stream, along with some kind of attention function that amplified or diminished the importance of those at any given time period -- all still represented as a stream of tokens.
If an LLM could be trained on input that was enriched by all of the above kind of data, then quite likely the output would feel much more human than the responses we get from LLMs.
Humans are moody, we get headaches, we feel drawn to or repulsed by others, we brood and ruminate at times, we find ourselves wanting to impress some people, some topics make us feel alive while others make us feel bored.
Human intelligence is always colored by the human experience of obtaining it. Obviously we don't obtain it by getting trained on terabytes of data all at once disconnected from bodily experience.
Seemingly we could simulate a "body" and provide that as real time token metadata for an LLM to incorporate, and we might get more moodiness, nostalgia, ambition, etc.
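Purely as a hypothetical sketch of what such an enriched token stream might look like (field names invented for illustration, not any real training format):

    from dataclasses import dataclass

    @dataclass
    class EnrichedToken:
        token_id: int          # ordinary vocabulary token
        heart_rate: float      # example "body" channel
        valence: float         # example emotional channel, -1..1
        salience: float = 1.0  # attention-style weight on the metadata

    def amplify(tokens, threshold=0.5):
        # Toy attention function: boost the metadata weight of emotionally charged tokens.
        for t in tokens:
            if abs(t.valence) > threshold:
                t.salience *= 2.0
        return tokens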
Asking for a theory of mind is in fact committing the Cartesian error of making a mind/body distinction. What is missing with LLMs is a theory of mindbody... similarity to spacetime is not accidental as humans often fail to unify concepts at first.
LLMs are simply time series predictors that can handle massive numbers of parameters in a way that allows them to generate corresponding sequences of tokens that (when mapped back into words) we judge as humanlike or intelligence-like, but those are simply patterns of logic that come from word order, which is closely related in human languages to semantics.
It's silly to think that we humans are not abstractly representable as a probabilistic time series prediction of information. What isn't?
Then the next research step could be to study those properties so as to reconstruct/reproduce a theory of mind-body AI, without needing any embodiment process at all to obtain it. Is that, in principle, possible? It is unclear to me.
... a hardware interface that generates a token stream from a living human's body would seem to enable this at some level.
Not sure how it would work at scale. Maybe something much simpler like phones with built-in VOC sensors that can detect nuances of the user's perspiration, combined with real time emotion sensing via gait, voice, along with metadata that is already available would be sufficient to produce such a token stream... who knows.
The eval is a weird, noisy visual task (picture of astronaut with “care packages”). Their results are hopelessly narrow.
A better eval is to use actual scientifically tested psychology test on text (the native and strongest domain for LLMs), for example the sort of scenarios used to gauge when children develop theory of mind (“Alice puts her keys on the table then leaves the room. Bob moves the keys to the drawer. Alice returns. Where does she think the keys are?”) which GPT-4 can handle easily; it is very clear from this that GPT has a theory of mind.
A negative result doesn’t disprove capabilities; it could easily show your eval is garbage. Showing a robust positive capability is a more robust result.
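A sketch of that kind of text-based probe; ask_llm is a placeholder for any LLM call, and swapping in unusual objects/locations helps rule out memorized test items:

    def ask_llm(prompt: str) -> str:
        # Stand-in for a real LLM client.
        raise NotImplementedError

    def false_belief_probe(agent="Alice", mover="Bob", obj="keys",
                           start="table", end="drawer") -> str:
        scenario = (
            f"{agent} puts her {obj} on the {start} and leaves the room. "
            f"While she is gone, {mover} moves the {obj} to the {end}. "
            f"{agent} returns. Where does {agent} think the {obj} are, and why?"
        )
        return ask_llm(scenario)

    # e.g. false_belief_probe(obj="gloves", start="shelf", end="freezer") varies the
    # surface details away from the classic Sally-Anne phrasing.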
Aren't you confusing having a theory of mind with being able to output the right answer to a test? Isn't your proposed evaluation especially problematic because an "actual scientifically tested psychology test" is likely in the training data along with a lot of discussion and analysis of that test and the correct and incorrect answers that can be given?
Of course, another approach you can take if you suspect contamination is to ask follow-up questions that are less likely to be in any training data; if the LLM is a mere stochastic parrot without a ToM it will not be able to give well-formed answers to follow-up questions.
Perhaps if I'd said "established psychology research methodology" the intent would have been clearer?
As far as I'm concerned, the only person I can be certain is conscious is me.
It doesn't have to be a "scientifically tested psychology test"
Construct your own story with multiple characters of varying knowledge and beliefs and see how it does.
Nothing I’ve seen shows evidence of any sort of abstract concepts in there.
The conscious "why" comes after the decision. In that sense it's exactly the kind of bullshit machine that LLMs are.
The brain is trying to 'predict' the next sensory input, and that prediction is our awareness. What we would call our 'conscious self'.
It makes a point of calling it a 'controlled hallucination': that is what we experience as our self. "Hallucination" being the experience we have of our brain predicting/controlling for the sensory input. So all inputs come together in a 'hallucination', but it is averaged, Bayesian-style, with the actions we are taking at the same time. So Action + Prediction = Self.
It is funny that the word 'hallucinate' has become so common in AI and is also used of humans. And so few people seem to make the connection that they are actually very similar; far from being an argument against AI consciousness, it is an argument for how similar they are.
I perceive a moving of the goalposts as machine intelligence improves. Once we'd have been happy with smarter than an especially stupid person, now I think we're aiming at smarter than the smartest person.
Goal posts only exist in games.
These systems are engineering products to be leveraged in engineering processes. We want to understand what they're good at and what they're bad at, and what potential they show for further refinement. There are no goal posts or "happy with" criteria in that context, and when we find ourselves adjusting the language we use to describe them because of how we see them work, we're trying to refine our ability to express their capabilities and suitabilities.
Intelligence, in particular, is a very poor and ambiguous word to be stuck using in technical contexts and so we're likely to just gradually shed it over time to reduce confusion as we hone in on better ways to talk about these systems. We've repeatedly done the same for earlier advances in the field, and for the same reason.
We get a better and better idea of what this hazy term "intelligence" means as we DIY tinker with making our own new ones.
> Once we'd have been happy with smarter than an especially stupid person, now I think we're aiming at smarter than the smartest person.
We're going to get there sooner than we think. When we get there, we will have new things to regret in ways we'd never thought of before.
I'll take that. My own expectation is I'll have a few minutes-to-months to say "I told you so".
If you regard "an especially stupid person" as someone with significant cognitive or communication limits, then Parry and Eliza's Doctor are pretty fair simulations of paranoid schizophrenia (as it was understood at the time) and Rogerian therapy. Likewise, chess and go AIs are pretty damn smart, except they can't do anything else.
The point is that, if you accept limits on what the machine needs to do, then "intelligence" as defined by behavior you can recognize becomes trivially and meaninglessly easy.
(It's sort of like evaluating a person's competence: a minority person has to be more competent than their cohort because non-minority people get the benefit of the doubt.)
"A chief goal of levers (cranes, etc.) engineering would be to build devices that lift like people"
There is gain in implementing desirable qualities. Just that.
Odd restriction. Why not investigate text-based ones?
Or is “vision-based” a technical term that encompasses models that were trained on text?
Funnily enough, this statement also applies to people that are scared of AI.
Maybe a bit off topic but does anyone else have that friend who sends them fear mongering AI videos with captions like "shocking AI" that are blatantly unimpressive or completely fake?
What is the best way to subdue this kind of fear in a friend? Sending them written articles from high-level researchers like Brooks does not work.
Edit: this may be turning into a search-for-truth-and-definitions-of-reality question. When the last person alive is no longer able to tell whether they're speaking to an AI, does it actually matter whether it's true generalized intelligence or just an emergent approximation?
Similarly, if you want to approximate a human, an LLM may be the best we can do right now, but it's hardly a good approximation.