0 points · albertzeyer · 2y ago · 0 comments

You are speaking more about n-gram models here. NNs do far more than that.

Or if you just want to say that NNs are used as a statistical model here: well, yeah, but that doesn't really tell you anything. Everything can be a statistical model.

E.g., you could also say "this is exactly the way the human brain works", but that doesn't really tell you anything about how it actually works.

7 comments · 2 top-level

mjburgess · 2y ago · 4 in thread

My description is true of any statistical learning algorithm.

The thing that people are looking to for answers, the NN itself, does not have them. That's like looking to Newton's compass to understand his general law of gravitation.

The reason that LLMs trained on the internet and every ebook have the structure of human communication is that the dataset has that structure. Why does the data have that structure? This requires science; there is no explanation "in the compass".

NNs are statistical models trained on data -- drawing analogies to animals is a mystification that causes people's ability to think clearly to jump out the window. No one compares stock price models to the human brain; no banking regulator says, "well, your volatility estimates were off because your machines had the wrong thoughts". This is pseudoscience.

Animals are not statistical learning algorithms, so the reason that's uninformative is that it's false. Animals are in direct causal contact with the world and uncover its structure through interventional action and counterfactual reasoning. The structure of animal bodies and their general learning strategies are well known, and have nothing to do with LLMs/NNs.

The reason that I know "The cup is in my hand" is not because P("The cup is in my hand"|HistoricalTexts) > P(not "The cup is in my hand"|HistoricalTexts)
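To make that inequality concrete, here is a minimal sketch of what a purely text-frequency model does with it, assuming a toy bigram model with add-one smoothing. The three-sentence corpus standing in for HistoricalTexts and the function names are made up for illustration; this is not a claim about how any particular LLM computes probabilities.

```python
from collections import Counter

def train_bigrams(corpus):
    """Count bigrams and their left-hand contexts over a list of sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for line in corpus:
        tokens = ["<s>"] + line.lower().split()
        vocab.update(tokens)
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    return contexts, bigrams, len(vocab)

def sentence_probability(sentence, contexts, bigrams, vocab_size, alpha=1.0):
    """Add-alpha smoothed bigram estimate; a rough score for unseen words."""
    tokens = ["<s>"] + sentence.lower().split()
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= (bigrams[(prev, word)] + alpha) / (contexts[prev] + alpha * vocab_size)
    return prob

# Hypothetical stand-in for "HistoricalTexts".
historical_texts = [
    "the cup is in my hand",
    "the cup is on the table",
    "the pen is in my hand",
]

contexts, bigrams, vocab_size = train_bigrams(historical_texts)
p_yes = sentence_probability("the cup is in my hand", contexts, bigrams, vocab_size)
p_no = sentence_probability("the cup is not in my hand", contexts, bigrams, vocab_size)

# The comparison depends only on word frequencies in the corpus,
# not on whether anyone is actually holding a cup.
print(p_yes > p_no)  # True
```

Swapping the corpus for one full of "the cup is not in my hand" would flip the result without anything about the cup changing.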

vineyardmike · 2y ago

> The reason that I know "The cup is in my hand" is not because P("The cup is in my hand"|HistoricalTexts) > P(not "The cup is in my hand"|HistoricalTexts)

I mostly agree with your points, but I still disagree with this premise. Humans (and other animals) absolutely are statistical reasoning machines. They're just advanced ones which can process more than "text" - they're multi-modal.

As a super dumb-simple set of examples: Think about the origin of the phrase "Cargo Cult" and similar religious activities - people will absolutely draw conclusions about the world based on their learned observations. Intellectual "reasoning" (science!) really just relies on more probabilities or correlations.

The reason you know the cup is in your hand is because P("I see a cup and a hand"|HistoryOfEyesight) + P("I feel a cylinder shape"|HistoryOfTactileFeeling) + .... > P(Inverse). You can pretend it's because humans are intelligent beings with deep reasoning skills (not trying to challenge your smarts here!), but humans learn through trial and error just like an NN with reinforcement learning.

Close your eyes and ask a person to randomly place either a cup from your kitchen or a different object in your hand. You can probably tell which one it is. Why? Because you have learned what it feels like, and learned countless examples of cups that are different, from years of passive practice. That's basically deep learning.
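One conventional way to read the "sum of evidence" above is naive Bayes, where independent observations combine by adding log-likelihood ratios rather than raw probabilities. The sketch below assumes that reading. The prior and the likelihood values are invented for illustration, and the function name is hypothetical.

```python
import math

def posterior_cup(prior_cup, likelihoods):
    """Posterior P(cup | observations) under a naive Bayes assumption.

    likelihoods: list of (P(obs | cup), P(obs | not cup)) pairs,
    one per sense, treated as conditionally independent.
    """
    log_odds = math.log(prior_cup / (1.0 - prior_cup))
    for p_given_cup, p_given_other in likelihoods:
        log_odds += math.log(p_given_cup / p_given_other)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Illustrative, made-up numbers learned "from history":
sight = (0.9, 0.2)  # P("I see a cup and a hand" | cup), same given not-cup
touch = (0.8, 0.3)  # P("I feel a cylinder shape" | cup), same given not-cup

both_senses = posterior_cup(0.5, [sight, touch])
eyes_closed = posterior_cup(0.5, [touch])  # the blindfold case: touch only
```

With these numbers, touch alone already pushes the posterior above one half, and adding sight pushes it higher, which is the shape of the argument being made here.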

mjburgess · 2y ago

I mean something specific by "statistics": modelling frequency associations in static ensembles of data.

Having a body which changes over time that interacts with a world that changes over time makes animal learning not statistical (call it, say, experimental). That animals fall into Skinner-box irrational behaviour can be modelled as a kind of statistical learning, but it actually isn't.

It's a failure of ecological salience mechanisms in regulating the "experimental learning" that animals engage in. E.g., with the cargo cults, the reason they adopted that view was that their society had a "big man" value system based on material acquisition, and Western warring powers seemed Very Big and so were humiliating. In order to retain their status they adopted (apparently irrational) theories of how the world worked (gods, etc.).

From the outside this process might seem statistical, but it's the opposite. Their value system made material wealth have a different causal salience which was useful in their original ecology (a small island with small resources), but it went haywire when faced with the whole world.

Eventually these mechanisms update with this new information, or the tribe dies off -- but what's going wrong here is that the very very non-statistical learning ends up describable that way.

This is, indeed, why we should be very concerned about people Skinner-boxing themselves with LLMs.

2 more replies

Demlolomot · 2y ago

If learning in real life over 5-20 years shows the same result as an LLM being trained on billions of tokens, then yes, it can be compared.

And there are a lot of people out there who do not do a lot of reasoning.

After all, optical illusions exist; our brains generalize.

The same thing happens with words, like the riddle about the doctor operating on a child, where we discover that the doctor is actually female.

And while LLMs only use text, we can already see multimodal models getting better, along with architectures and hardware.

mjburgess · 2y ago

I don't know what your motivation in comparison is; mine is science, i.e., explanation.

I'm not interested that your best friend emits the same words in the same order as an LLM; I'm more interested that he does so because he enjoys your company whereas the LLM does not.

Engineers overstep their mission when they assume that because you can substitute one thing for another, and sell a product in doing so, that this is informative. It isn't. I'm not interested in whether you can replace the sky with a skybox and have no one notice -- who cares? What might fool an ape is everything, and what that matters for science is nothing.

1 more reply

cornholio · 2y ago · 1 in thread

> "this is exactly the way the human brain works"

I'm always puzzled by such assertions. A cursory look at the technical aspects of an iterated attention-perceptron transformation clearly shows it's just a convoluted and powerful way to query the training data, a "fancy" Markov chain. The only rationality it can exhibit is that which is already embedded in the dataset. If trained on nonsensical data it would generate nonsense, and if trained on a partially nonsensical dataset it will generate an average between truth and nonsense that maximizes some abstract algorithmic goal.

There is no knowledge generation going on, no rational examination of the dataset through the lens of an internal model of reality that allows the rejection of invalid premises. The intellectual food is already chewed and digested in the form of the training weights, with the model just mechanically extracting the nutrients, as opposed to venturing into the outside world to hunt.

So if it works "just like the human brain", it does so in a very remote sense, just like a basic neural net works "just like the human brain", i.e., individual biological neurons can be said to be somewhat similar.
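For reference, this is roughly what a plain (non-"fancy") Markov chain over words looks like: a minimal sketch, assuming a first-order chain and a made-up training string with one false statement mixed in. A transformer is far more elaborate than this; the sketch only illustrates the "query the training data" characterization, and all names in it are hypothetical.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that followed it in the text."""
    chain = defaultdict(list)
    words = text.split()
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length, seed=0):
    """Walk the chain, sampling each next word from the observed followers."""
    rng = random.Random(seed)
    word, output = start, [start]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word never had a successor
            break
        word = rng.choice(followers)
        output.append(word)
    return output

# Made-up training data: partly true, partly nonsense.
training_text = "the sky is blue the sky is green the grass is green"

chain = build_chain(training_text)
sample = generate(chain, "the", 8)
print(" ".join(sample))
```

Every adjacent word pair in the output already occurs in the training text, and after "is" the chain picks "green" twice as often as "blue", because that is the mix it was given.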

pas · 2y ago

If a human spends the first 30 years of their life in a cult, they will also be speaking nonsense a lot - from our point of view.

Sure, we have a nice inner loop, we do some pruning, picking and choosing, updating, weighting things based on emotions, goals, etc.

Who knows how complicated those things will prove to model/implement...
