> I must stress that the idea of "Science predicting facts" is a consolidated formula in Philosophy of Science.

Respectfully, I'd suggest that you are misinterpreting it or using the wrong terminology. Science is not a thing, it is a process: a hypothesis is a prediction about the world, which is validated or disproven via experiment. A validated hypothesis -- like Newton's physics -- is a model for how the world works, which may later be superseded by more accurate models. Newton's physics, though a great stride in our understanding of the world, is not a fact; it is an approximation of reality.
> *the predicting activities of a junkie under psychedelics and that of a lucid thinker are substantially different.*
There's also a substantial difference between the predicting activities of a cat and those of a man.
Scratch the surface, though, and the same type of thing is happening.
Of course LLMs don't predict things exactly as you do. But at what they were trained to do -- in much the same way a cat was "trained" by long eons to hunt mice -- they're extremely capable, and unlike cats, they're extensible and capable of abstraction much as humans are. It's not even clear, in the general case, that how they work is any worse than how we work. It's still early.
Your point that they're structurally flawed is noted -- but look at the average human and try to tell me that human reasoning is flawless. If anything, human reasoning is the more unreliable of the two. As for your detective game, how many humans, picked at random, could solve it?
> You have the framework very very wrong: the point is not that we memorize, the point is that those LLMs don't check.
Use DeepSeek R1 and try to tell me that it doesn't check. Not only does it check, it will openly agonize over the answer it gives you. At solving math problems for engineering purposes, it's in the 99.9th percentile of humans, if not far beyond, despite being ~1 year old; on the hardest problems it performs at a postgraduate level. In the very near future, the successors of today's LLMs will be proving new theorems.
Reasoning models in general disprove the point you're trying to make here. Their inference is more costly, but they are capable of procedural thinking.