
0 points · nox101 · 1y ago · 0 comments

It sounds like you think this research is wrong? (It claims LLMs cannot reason.)

https://arstechnica.com/ai/2024/10/llms-cant-perform-genuine...

Or do you maybe think no logical reasoning is needed to do everything a human can do? Though humans do seem to be able to do logical reasoning.

13 comments · 3 top-level

bbor · 1y ago · 9 in thread

I’ll pop in with a friendly “that research is definitely wrong”. If they want to prove that LLMs can’t reason, shouldn’t they stringently define that word somewhere in their paper? As it stands, they’re proving something small (some of today’s LLMs have XYZ weaknesses) and claiming something big (humans have an ineffable calculator-soul).

LLMs absolutely 100% can reason, if we take the dictionary definition; it’s trivial to show their ability to answer non-memorized questions, and the only way to do that is some sort of reasoning. I personally don’t think they’re the most efficient tool for deliberative derivation of concepts, but I also think any sort of categorical prohibition is anti-scientific. What is the brain other than a neural network?

Even if we accept the most fringe, anthropocentric theories like Penrose & Hameroff’s quantum microtubules, that’s just a neural network with fancy weights. How could we possibly hope to forbid digital recreations of our brains from “truly” or “really” mimicking them?

ddingus · 1y ago

Can they reason, or is the volume of training data sufficient for them to match relationships up to appropriate expressions?

Basically, if humans have had meaningful discussions about it, the product of their reasoning is there for the LLM, right?

Seems to me, the "how many R's are there in the word 'strawberry'" problem is very suggestive of the idea that LLM systems cannot reason. If they could, the question would not be difficult.

The fact is humans may never have actually discussed that topic in any meaningful way captured in the training data.

And because of that and how specific the question is, the LLM has no clear relationships to map into a response. It just produces a best guess, whatever the math deems best.

Seems plausible enough to support the opinion that LLMs cannot reason.

What we do know is LLMs can work with anything expressed in terms of relationships between words.

There are a ton of reasoning templates contained in that data.

Put another way:

Maybe LLM systems are poor at deduction, save for examples contained in the data. But there are a ton of examples!

So this is hard to notice.

Maybe LLM systems are fantastic at inference! And so those many examples get mapped to the prompt at hand very well.

And we do notice that and see it like real thinking, not just some horribly complex surface containing a bazillion relationships...

chongli · 1y ago

The “how many R’s are in the word strawberry?” problem can’t be solved by LLMs specifically because they do not have access to the text directly. Before the model sees the user input it’s been tokenized by a preprocessing step. So instead of the string “strawberry”, the model just sees an integer token the word has been mapped to.
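To make the tokenization point concrete, here is a minimal sketch. It assumes a made-up three-entry vocabulary (`TOY_VOCAB`) and a hypothetical greedy longest-match splitter (`toy_tokenize`); it is not any real model's tokenizer. Real tokenizers are learned from data and may map a word to one token or to several subword tokens, but the effect is the same: the model receives integers, not letters.

```python
# Toy illustration only: a hypothetical vocabulary, not a real tokenizer.
TOY_VOCAB = {"straw": 101, "berry": 102, "how": 103}


def toy_tokenize(word, vocab):
    """Greedy longest-match split of `word` into integer IDs from `vocab`."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shorter ones.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            # No vocabulary entry starts at position i.
            raise ValueError(f"no vocabulary entry covers {word[i:]!r}")
    return ids


# What the model would receive under this toy vocabulary: integer IDs.
token_ids = toy_tokenize("strawberry", TOY_VOCAB)

# Counting letters is trivial on the raw string, which the model never sees.
letter_count = "strawberry".count("r")

print(token_ids)     # [101, 102]
print(letter_count)  # 3
```

Nothing in `[101, 102]` exposes the letters of the word, so any letter count has to come from associations learned during training rather than from inspecting the characters.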

tsimionescu · 1y ago

> Even if we accept the most fringe, anthropocentric theories like Penrose & Hameroff’s quantum microtubules, that’s just a neural network with fancy weights.

First, while it is a fringe idea with little backing it, it's far from the most fringe.

Secondly, it is not at all known that animal brains are accurately modeled as an ANN, any more so than any other Turing-compatible system can be modeled as an ANN. Biological neurons are themselves small computers, like all living cells in general, with not fully understood capabilities. The way biological neurons are connected is far more complex than a weight in an ANN. And I'm not talking about fantasy quantum effects in microtubules, I'm talking about well-established biology, with many kinds of synapses, some of which are "multicast" in a spatially distinct area instead of connected to specific neurons. And about the non-neuronal glands which are known to change neuron behavior and so on.

How critical any of these differences are to cognition is anyone's guess at this time. But dismissing them and reducing the brain to a bigger NN is not wise.

Koala_ice · 1y ago

There's a lot of other interesting biology besides propagation of electrical signals. Examples include: 1/ Transport of mRNAs (in specialized vesicle structures!) between neurons. 2/ Activation and integration of retrotransposons during brain development (which I have long hypothesized acts as a sort of randomization function for the neural field). 3/ Transport of proteins between and within neurons. This isn't just adventitious movement, either: neurons have a specialized intracellular transport system that allows them to deliver proteins to faraway locations (think more than 1 meter).

adrianN · 1y ago

It is my understanding that Penrose doesn’t claim that brains are needed for cognition, just that brains are needed for a somewhat nebulous “conscious experience”, which need not have any observable effects. I think that it’s fairly uncontroversial that a machine can produce behavior that is indistinguishable from human intelligence over some finite observation time. The Chinese room speaks Chinese, even if it lacks understanding for some definitions of the term.

shkkmo · 1y ago

> LLMs absolutely 100% can reason, if we take the dictionary definition; it’s trivial to show their ability to answer non-memorized questions, and the only way to do that is some sort of reasoning.

Um... What? That is a huge leap to make.

'Reasoning' is a specific type of thought process and humans regularly make complicated decisions without doing it. We use hunches and intuition and gut feelings. We make all kinds of snap assessments that we don't have time to reason through. As such, answering novel questions doesn't necessarily show a system is capable of reasoning.

I see absolutely nothing resembling an argument for humans having an "ineffable calculator soul"; I think that might be you projecting. There is no 'categorical prohibition', only an analysis of the current flaws of specific models.

Personally, my skepticism about imminent AGI has to do with believing we may be underestimating the complexity of the software running on our brain. We've reached the point where we can create digital "brains", or at least portions of them. We may be missing some other pieces of a digital brain, or we may just not have the right software to run on it yet. I suspect it is both, but that we'll have fully functional digital brains well before we figure out the software to run on them.

bbor · 1y ago

All well said, and I agree on many of your final points! But you beautifully highlighted my issue at the top:

> 'Reasoning' is a specific type of thought process

If so, what exactly is it? I don’t need a universally justified definition, I’m just looking for an objective, scientific one. A definition that would help us say for sure that a particular cognition is or isn’t a product of reason.

I personally have lots of thoughts on the topic and look to Kant and Hegel for their definitions of reason as the final faculty of human cognition (after sensibility, understanding, and judgement), and I even think there’s good reason (heh) to think that LLMs are not a great tool for that on their own. But my point is that none of the LLM critics have a definition anywhere close to that level of specificity.

Usually, “reason” is used to mean “good cognition”, so “LLMs can’t reason” is just a variety of cope/setting up new goalposts. We all know LLMs aren’t flawless or infinite in their capabilities, but I just don’t find this kind of critique specific enough to have any sort of scientific validity. IMHO

visarga · 1y ago

We're chasing our own tail with concepts like "reasoning". Let's move the concept a bit: "search". Can LLMs search for novel ideas and discoveries? They do under the right circumstances. You've got to provide idea-testing environments, the missing ingredient. Search and learn: it's what humans do, and AI can do it as well.

The whole issue with "reasoning" is that it is an incompletely defined concept. Over what domain, what problem space, and what kind of experimental access do we define "reasoning"? Search is better as a concept because it comes packed with all these things, and without conceptual murkiness. Search is scientifically studied to a greater extent.

I don't think we doubt LLMs can learn given training data; we already accuse them of being mere interpolators or parrots. And we can agree to some extent that LLMs can recombine concepts correctly. So they've got the learning part down.

And for the searching part, we can probably agree it's a matter of access to the search space, not AI. It's an environment problem, and even a social one. Search is usually more extended than the lifetime of any agent, so it has to be a cultural process, where language plays a central role.

When you break reasoning/progress/intelligence into "search and learn" it becomes much more tractable and useful. We can also make more grounded predictions on AI, considering the needs for search that are implied, not just the needs for learning.

How much search did AlphaZero need to beat us at Go? How much search did humans pack into our 200K-year history over 10,000 generations? What was the cost of that journey of search? Those kinds of questions. In my napkin estimations we solved 1:10000 of the problem by learning; search is 10000x to a million times harder.

shkkmo · 1y ago

You can't break down cognition into just "search" and "learn" without either ridiculously overloading those concepts or leaving a ton out.

astrange · 1y ago · 1 in thread

It says "current" LLMs can't "genuinely" reason. Also, one of the researchers then posted an internship for someone to work on LLM reasoning.

I think the paper should've included controls, because we don't know how strong the result is. They certainly may have proven that humans can't reason either.

mannykannot · 1y ago

If they had human controls, they might well show that some humans can’t do any better, but based on how they generated test cases, it seems unlikely to me that doing so would prove that humans cannot reason (of course, if that’s actually the case, we cannot trust ourselves to devise, execute and interpret these tests in the first place!)

Some people will use any limitation of LLMs to deny there is anything to see here, while others will call this ‘moving the goalposts’, but the most interesting questions, I believe, involve figuring out what the differences are, putting aside the question of whether LLMs are or are not AGIs.

CSMastermind · 1y ago

The latter.

While I generally do suspect that we need to invent some new technique in the realm of AI in order for software to do everything a human can do, I use analogies like chess engines to caution myself against certainty.
