We know these are algorithms, but how many people fall in love or make friends over nothing but a letter or text message?
Capabilities for reasoning aside, we should all be very careful about perceptions of intelligence based solely on a machine's or algorithm's apparent ability to communicate.
I don't think that's merely an irrational compulsion. Communication can immediately demonstrate intelligence, and I think it quite clearly has, in numerous ways. The benchmarks out there cover a reasonable range of measurements that aren't subjective, and there are clear yes-or-no answers to whether the communication is showing real ways to solve problems (e.g. changing a tire, writing lines of code, solving word problems, critiquing essays), where the output proves it in the first instance.
Where there's an open question is in whether you're commingling the notion of intelligence with consciousness, or identifying intelligence with AGI, or with "human like" uniqueness, or some other special ingredient. I think your warning is important and valid in many contexts (people tend to get carried away when discussing plant "intelligence", and earlier versions of "AI" like Eliza were not the real deal, and Sophia the robot "granted citizenship" was a joke).
But this is not a case, I think, where it's a matter of intuitions leading us astray.
I’m absolutely commingling these two things and that is an excellent point.
Markov chains and other algorithms that can generate text can give the appearance of intelligence without any kind of understanding or consciousness.
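To make that concrete, here's a minimal sketch (my own toy example, not any production system): a bigram Markov chain that emits plausible-looking word sequences purely from transition counts, with no model of meaning at all.

```python
import random
from collections import defaultdict

# Toy bigram Markov chain: learn word-to-word transition counts
# from a tiny corpus, then sample text one word at a time.
corpus = "the cat sat on the mat the dog sat on the rug".split()

transitions = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    transitions[current].append(nxt)

def generate(start, length, seed=0):
    random.seed(seed)
    words = [start]
    for _ in range(length - 1):
        options = transitions.get(words[-1])
        if not options:
            break
        words.append(random.choice(options))
    return " ".join(words)

# Prints a locally plausible but meaningless word sequence,
# e.g. something like "the cat sat on the mat".
print(generate("the", 6))
```

Every sentence it produces looks locally fluent, because each adjacent word pair actually occurred in the corpus, yet nothing resembling understanding is involved.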
I’m not personally certain that consciousness is even a prerequisite for intelligence, given that as far as we know consciousness is an emergent property stemming from some level of problem-solving ability.
glenstein made a good point that I was commingling concepts of intelligence and consciousness. I think his commentary is really insightful here: https://news.ycombinator.com/item?id=42912765
Personally, I see ChatGPT say "water doesn't freeze at 27 degrees F" and think "how can it possibly do advanced reasoning when it can't do basic reasoning?"
(Basically, I don't think there's a bright line here: saying "they can't reason" isn't very useful, instead it's more useful to talk about what kinds of things they can reason about, and how reliably. Because it's kind of amazing that this is an emergent behaviour of training on text prediction, but on the other hand because prediction is the objective function of the training, it's a very fuzzy kind of reasoning and it's not obvious how to make it more rigorous or deeper in practice)
When you ask an LLM "what is 2 + 2?" and it says "2 + 2 = 4", it looks like it's recognizing two numbers and the addition operation, and performing a calculation. It's not. It's finding a common response in its training data and returning that. That's why you get hallucinations on any uncommon math question, like multiplying two random 5-digit numbers. It's not carrying out the logical operations; it's trying to extract an answer by next-token prediction. That's not reasoning.
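A toy contrast (entirely my own illustration; a real transformer is far more sophisticated than a lookup table, but the failure mode is analogous): recall of memorized question/answer strings works for common problems and falls apart on rare ones, while actual computation doesn't care how often the question appears.

```python
# Hypothetical sketch: "answering" arithmetic by recalling memorized
# question/answer strings, versus actually performing the operation.
memorized = {
    "2 + 2": "4",     # common in any text corpus
    "7 * 8": "56",    # times tables appear everywhere
}

def answer_by_recall(question):
    # Pattern lookup: no arithmetic is ever performed.
    return memorized.get(question, "<hallucinated guess>")

def answer_by_computation(a, b):
    # Actual logical operation: correct for any operands.
    return a * b

print(answer_by_recall("2 + 2"))            # "4" -- looks like reasoning
print(answer_by_recall("48271 * 91733"))    # falls outside memorized patterns
print(answer_by_computation(48271, 91733))  # 4428043643, regardless of rarity
```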
When you ask "will water freeze at 27F?" and it replies "No, the freezing point of water is 32F", it's not recognizing that 27 and 32 are numbers, that a freezing point is an upper threshold, and that any temperature below that threshold will therefore also be freezing. It's looking up the next token and finding nothing in its training data about how 27F is below freezing.
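The missing inference is trivial to state as an explicit rule: a freezing point is a threshold, so anything at or below it freezes. A two-line sketch of that logic (names are mine):

```python
FREEZING_POINT_F = 32  # water freezes at or below this temperature

def water_freezes(temp_f):
    # Explicit threshold reasoning: 27F is below 32F,
    # therefore water freezes at 27F.
    return temp_f <= FREEZING_POINT_F

print(water_freezes(27))  # True
print(water_freezes(40))  # False
```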
Again, it's not reasoning. It's not exercising any logic. Its huge training data set and tuned proximity matching help it find likely responses, and when it seems right, that's because those token relationships already existed in the training data set.
That it occasionally breaks the rules of chess just shows it has no concept of those rules, only that the next token for a chess move is most likely legal because most of its chess training data is of legal games, not illegal moves. I'm unsurprised to find that it can beat an average player if it doesn't break the rules: most chess information in the world is about better than average play.
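Having a concept of the rules would mean something like an explicit legality check, not a statistical sense that a move resembles moves seen in past games. A sketch of what such a check looks like for a single piece (hypothetical helper, squares in standard algebraic notation):

```python
def knight_move_legal(src, dst):
    # A knight moves in an L shape: deltas of (1, 2) or (2, 1)
    # across files (letters) and ranks (digits).
    file_delta = abs(ord(src[0]) - ord(dst[0]))
    rank_delta = abs(int(src[1]) - int(dst[1]))
    return sorted((file_delta, rank_delta)) == [1, 2]

print(knight_move_legal("g1", "f3"))  # True: a legal knight move
print(knight_move_legal("g1", "g3"))  # False: not an L shape
```

A system with this check can never emit an illegal knight move; a next-token predictor merely makes illegal moves improbable, which is exactly the behaviour we observe.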
If an LLM came up with a proof no one had seen before, and it checks out, that doesn't prove it's reasoning either, because it was still next-token prediction that produced it. It found token relationships no one had noticed before, but those were inherent in the training data; that's not a reflective intelligence doing logic.
When we discuss things like reinforcement learning and chain of reasoning, what we're really talking about are ways of restricting/strengthening those token relationships. It's back-tuning of the training data. Still not doing logic.