Better HN

0 points · byby · 3y ago · 0 comments

> I agree. LLMs are very impressive, but it isn't helpful to think of them as magic. LLMs are a great tool to explore and remix the body of human knowledge on the internet (limited to what it has been trained on).

Of course you shouldn't think of it as magic. But the experts themselves admit they don't fully understand how LLMs can produce such output. It's definitely emergent behavior. We've built something we don't understand, and although it's not magic, it's one of the closest things to it that can exist. Think about it. What is the closest thing in reality to magic? Literally, building something we can't understand is it.

It's one thing to think of something as magic; it's another thing to try to simplify a highly complex concept into a box. When Elon Musk got his rockets to space, why were people so floored by decades-old technology that he simply made cheaper?

But when someone makes AI that can literally do almost anything you ask it to, everyone just suddenly says it's a simple stochastic parrot that can't do much?

I think it's obvious. It's because a rocket can't replace your job or your identity. If part of your skillset and identity is "master programmer" and suddenly there's a machine that can do better than you, the easiest way to stop that machine is to first deny reality.


11 comments · 1 top-level

mjburgess · 3y ago · 10 in thread

> the experts self admit they don't fully understand how LLMs can produce such output

Well I take myself to be an expert in this area, and I think it's fairly obvious how they work. Many of these so-called "Experts" are sitting on the boards of commercial companies with vested interests in presenting this technology as revolutionary. Indeed, much of what has been said recently in the media is little more than political and economic power plays disguised as philosophical musings.

A statistical AI system is a function `answer = f(question; weights)`. The `answer` obtains apparent "emergent" properties such as "suitability for basic reasoning tasks" when used by human operators.

But the function does not actually have those properties. It's a trick -- the weights are summaries of an unimaginable number of similar cases, and the function is little more than "sample from those cases and merge".

Properties of the output of this function obtain trivially in the way that all statistical functions generate increasingly useful output: by having increasingly relevant weights.

If you model linear data with just y = ax then as soon as you shift to "y = ax + b" you'll see the "emergent property" that the output is now sensitive to a background bias, b.
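The y = ax versus y = ax + b comparison can be made concrete with a small sketch. The data, function names, and error measure below are assumptions chosen for illustration, not anything from the thread: a line with a nonzero offset is fitted once without and once with a bias term, using the ordinary closed-form least-squares solutions.

```python
# Illustrative sketch only: the data and helper names are hypothetical.
# Fit the same linear data with y = a*x and with y = a*x + b.

def fit_no_bias(xs, ys):
    """Least-squares slope for the model y = a*x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def fit_with_bias(xs, ys):
    """Least-squares slope and intercept for the model y = a*x + b."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    var = sum((x - x_mean) ** 2 for x in xs)
    a = cov / var
    b = y_mean - a * x_mean
    return a, b

def mean_squared_error(xs, ys, a, b=0.0):
    """Average squared residual of the line a*x + b on the data."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Assumed toy data: an exact line with a background offset of 5.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 5.0 for x in xs]

a_only = fit_no_bias(xs, ys)
a, b = fit_with_bias(xs, ys)

mse_without_bias = mean_squared_error(xs, ys, a_only)
mse_with_bias = mean_squared_error(xs, ys, a, b)

print(f"y = ax     : a = {a_only:.3f}, MSE = {mse_without_bias:.3f}")
print(f"y = ax + b : a = {a:.3f}, b = {b:.3f}, MSE = {mse_with_bias:.3f}")
```

On this toy data the model without `b` cannot represent the offset at all, while the model with `b` recovers it. Whether that counts as "emergence" is exactly what the thread goes on to argue about.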

Emergence is an ontological phenomenon concerning how `f` would be realised by a physical system. In this case any physical system implementing `f` shows no such emergence.

Rather, the output of `f` has a "shift in utility" as the properties of the data it's trained on, as summarised by the weights, "shift in utility".

In other words, if you train a statistical system on everything ever written by billions of people over decades, then you will in fact see its "domains of applicability" increase, just as much as when you shift from a y=ax model to a y=ax+b.

To make this as simple as I can: statistical AI is just a funnel. ChatGPT is a slightly better funnel, but more so, it's had the ocean pass through it.

Many of its apparent properties are illusory, and much of the press around it puts forward cases where it appears to work and claims "look it works!". This is pseudoscience -- if you want to test a hypothesis of ChatGPT, find all the cases where it doesn't work -- and you will find that in the cases where it does there was some "statistical shortcut" taken.

FeepingCreature · 3y ago

I think this is a motte-and-bailey, "true and trivial vs incredible and false" type of thing. Given a sufficiently flexible interpretation of "sample from multiple cases and merge", humans do the same thing. Given a very literal interpretation, this is obviously not what networks do - aside from one paper to the contrary that relied on a very tortured interpretation of "linear", neural networks specifically do not output a linear combination of input samples.

And frankly, any interaction with even GPT 3.5 should demonstrate this. It's not hard to make the network produce output that was never in the training set at all, in any form. Even just the fact that its skills generalize across languages should already disprove this claim.

mchaver · 3y ago

> It's not hard to make the network produce output that was never in the training set at all, in any form.

Honest request because I am a bit skeptical: can you give an example of something it was not trained on in any form and can still give output for? And can it output something meaningful?

Because I have run a few experiments on ChatGPT for two spoken languages with standard written forms but without much of a presence on the internet and it just makes stuff up.

FeepingCreature · 3y ago

Well, it depends on the standard of abstraction that you accept. I don't think that ChatGPT has (or we've seen evidence of) any skills that weren't represented in its training set. But you can just invent an operation. For instance, something like, "ChatGPT: write code that takes a string that is even length and inverts the order of every second character." Actually, let me go try that...

And here we go! https://poe.com/s/UJxaAK9aVN8G7DLUko87 Note that it took me a long time, because GPT 3.5 really really wanted to misunderstand what I was saying; there is a strong bias to default to its training samples, especially if it's a common idea. But eventually, with only moderate pushing, its code did work.

What's interesting to me here is that after I threw the whole "step by step" shebang at it, it got code that was almost right. Surprisingly often, GPT will end up with code that's clever in methodology, but wrong in a very pedestrian way. IMO this means there has to be something wrong with the way we're training these networks.

edit: https://poe.com/s/gZW5ZGgiomWzabKJCUcA I gave it a more complete prompt because I only have one completion per day, but GPT-4 got it in one shot.

edit: https://poe.com/s/2lS8rjbGqHrzSkpEvLzr GPT 3.5 flubbed it given the same prompt.
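For reference, here is a minimal sketch of one possible reading of the invented operation in the prompt above. The prompt is ambiguous, so this is an assumption rather than what the linked transcripts necessarily produced: characters at even indices stay in place, and the characters at odd indices (every second character) are put back in reverse order. The function name is made up for illustration.

```python
def invert_every_second(text: str) -> str:
    """Reverse the order of the characters at odd indices (1, 3, 5, ...).

    Assumed interpretation of the prompt: characters at even indices keep
    their positions; only the "every second character" subsequence is reversed.
    """
    if len(text) % 2 != 0:
        raise ValueError("expected a string of even length")
    chars = list(text)
    # Extended-slice assignment: both sides have the same length, len(text) // 2.
    chars[1::2] = chars[1::2][::-1]
    return "".join(chars)


print(invert_every_second("abcdef"))  # b, d, f become f, d, b -> "afcdeb"
```

Other readings are just as defensible (for example, swapping each adjacent pair), which is part of why a model can "misunderstand" the request.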

cookieperson · 3y ago

Well you tried. You really did. But there are already people trying to form religions around LLMs. Some people can't be reasoned with.

pmoriarty · 3y ago

Are you speaking figuratively, or do you know of any specific instances of people forming actual religions around them? I'd be very interested in the latter.

cookieperson · 3y ago

I've seen people posting about it on a few message boards. Most of them sound like they've lost their minds or are under the influence, to be completely honest. I could try to dig up posts if you want, but it's more sad than interesting.


byby (OP) · 3y ago

He's just talking _. Clearly nobody here on either side has religious fervor around AI. One side is saying we don't understand LLMs completely and the other side is saying we absolutely do understand: it's all statistical parroting.

But to keep it with the religious theme... which side sounds more similar to religion? The side that claims it's absolutely impossible for LLMs to be anything more than a statistical operation, or the side that claims they don't know? One side seems to be making a claim based on faith while the other side is saying we don't know enough to make a claim... So which side sounds more religious?

pmoriarty · 3y ago

"I take myself to be an expert in this area, and I think it's fairly obvious how they work"

We can also say we understand chemistry but we don't understand how consciousness comes out of chemistry.

You can also say that humans are "just" physical processes, but that word "just" is doing a lot of heavy lifting.

mjburgess · 3y ago

I'd also say I've sufficient expertise in animal learning to reject the idea that animals have shallow interior lives comprised of compressions of historical cases.

A child touches a fireplace once -- not a thousand times. That's because they are in direct causal contact with the world, and their body has a whole-organism biochemical reaction to that stimulus which radically conditions it in all sorts of ways.

This is a world apart from statistical learning, wherein P(A|A causes B) and P(A|B) are indistinguishable -- and the bridge of "big data" merely illusory.

byby (OP) · 3y ago

> Well I take myself to be an expert in this area, and I think it's fairly obvious how they work. Many of these so-called "Experts" are sitting on the boards of commercial companies with vested interests in presenting this technology as revolutionary. Indeed, much of what has been said recently in the media is little more than political and economic power plays disguised as philosophical musings.

Bro, if you are an expert you'd already know that most of the exclamations that they don't fully understand LLMs are coming from researchers at universities. Hinton was my example of an "expert" as well, and he literally quit Google just so he can say his piece. You know who Hinton is, right? The person who repopularized backprop.

> A statistical AI system is a function `answer = f(question; weights)`. The `answer` obtains apparent "emergent" properties such as "suitability for basic reasoning tasks" when used by human operators.

Every layman gets that it's a multidimensional curve fitting process. The analogy you're using here, applying properties of lower dimensional and lower degree equations to things that are millions of dimensions in size on a complex curve, simply doesn't apply, because nobody fully understands the macro details of the curve and how that maps to the output it's producing.

The properties of a 2d circle don't map one to one to 3d let alone 500000000d.
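One standard illustration of how low-dimensional intuition fails in high dimensions (not something cited in the thread, and not a claim about neural networks specifically) is the volume of the unit ball, which follows V_n = π^(n/2) / Γ(n/2 + 1). The sketch below computes it in log space with Python's `math.lgamma`; the function name is an assumption for illustration.

```python
import math


def unit_ball_volume(n: int) -> float:
    """Volume of the unit ball in n dimensions: pi^(n/2) / Gamma(n/2 + 1).

    Computed via logarithms so that large n does not overflow.
    """
    log_volume = (n / 2) * math.log(math.pi) - math.lgamma(n / 2 + 1)
    return math.exp(log_volume)


for n in (2, 3, 5, 20, 100):
    print(f"n = {n:3d}: volume = {unit_ball_volume(n):.6g}")
```

The 2d case gives the familiar area π and the 3d case gives 4π/3, but the volume peaks around n = 5 and then shrinks toward zero, so by n = 100 almost none of the enclosing cube lies inside the ball.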

> Many of its apparent properties are illusory, and much of the press around it puts forward cases where it appears to work and claims "look it works!". This is pseudoscience -- if you want to test a hypothesis of ChatGPT, find all the cases where it doesn't work -- and you will find that in the cases where it does there was some "statistical shortcut" taken.

You don't even know what science is. Most of software engineering from design patterns to language choice to architecture is not science at all. There's no hypothesis testing or any of that. An expert (aka scientist) would be clear that ML is mostly mathematical theory with a huge dose of art layered on top.

The hypothesis for the AI in this case is, and I'm parroting the real experts here: "we don't understand what's going on." That's the hypothesis. How is that even testable? It's not, so none of this is "science". ML never was a science; it's an art with some theoretical origins.

But your "hypothesis" is it's just "statistical parroting" which is also untestable. But your claim is way more ludicrous because you made a claim and you can't prove it while I made a claim that basically says "we can't make any claims because we don't understand". See the difference?
