undefined | Better HN

0 pointsabra02y ago0 comments

>rightfully so

How the hell can people be so confident about this? You describe two smart people reasonably disagreeing about a complicated topic

0 comments

jumploops2y ago

The LLMs of today are just multidimensional mirrors that contain humanity's knowledge. They don't advance that knowledge, they just regurgitate it, remix it, and expose patterns. We train them. They are very convincing, and show that the Turing test may be flawed.

Given that AGI means reaching "any intellectual task that human beings can perform", we need a system that can go beyond lexical reasoning and actually contribute (on it's own) to advance our total knowledge. Anything less isn't AGI.

Ilya may be right that a super-scaled transformer model (with additional mechanics beyond today's LLMs) will achieve AGI, or he may be wrong.

Therefore something more than an LLM is needed to reach AGI, what that is, we don't yet know!

dboreham2y ago

Prediction: there isn't a difference. The apparent difference is a manifestation of human brain delusion about how human brains work. The Turing test is a beautiful proof of this phenomenon: so and so thing is impossibility hard only achievable via magic capabilities of human brains...oops no actually it's easily achievable now so we better re-define our test. This cycle Will continue until the singularly. Disclosure: I've been long term skeptical about AI but that writing is up on the wall now.

mlyle2y ago

Clearly there's a difference, because the architectures we have don't know how to persist information or further train.

Without persistence outside of the context window, they can't even maintain a dynamic, stable higher level goal.

Whether you can bolt something small to these architectures for persistence and do some small things and get AGI is an open question, but what we have is clearly insufficient by design.

I expect it's something in-between: our current approaches are a fertile ground for improving towards AGI, but it's also not a trivial further step to get there.

visarga2y ago

But context windows got to 100K now, RAG systems are everywhere, and we can cheaply fine-tune LoRAs for a price similar with inferencing, maybe 3x more expensive per token. A memory hierarchy made of LoRA -> Context -> RAG could be "all you need".

My beef with RAG is that it doesn't match on information that is not explicit in the text, so "the fourth word of this phrase" won't embed like the word "of", or "Bruce Willis' mother's first name" won't match with "Marlene". To fix this issue we need to draw chain-of-thought inferences from the chunks we index in the RAG system.

So my conclusion is that maybe we got the model all right but the data is too messy, we need to improve the data by studying it with the model prior to indexing. That would also fix the memory issues.

Everyone is over focusing on models to the detriment of thinking about the data. But models are just data gradients stacked up, we forget that. All the smarts the model has come from the data. We need data improvement more than model improvement.

Just consider the "Textbook quality data" paper Phi-1.5 and Orca datasets, they show that diverse chain of thought synthetic data is 5x better than organic text.

1 more reply

satvikpendem2y ago

> Clearly there's a difference, because the architectures we have don't know how to persist information or further train. Without persistence outside of the context window, they can't even maintain a dynamic, stable higher level goal.

Nope, and not all people can achieve this as well. Would you call them less than humans than? I assume you wouldn't, as it is not only sentience of current events that maketh man. If you disagree, then we simply have fundamental disagreements on what maketh man, thus there is no way we'd have agreed in the first place.

Paul-Craft2y ago

Isn't RAG essentially the "something small you can bolt on" to an LLM that gives it "persistence outside the context window?" There's no reason you can't take the output of an LLM and stuff it into a vector database. And, if you ask it to create a plan to do a thing, it can do that. So, there you have it: goal-oriented persistence outside of the context window.

I don't claim that RAG + LLM = AGI, but I do think it takes you a long way toward goal-oriented, autonomous agents with at least a degree of intelligence.

razodactyl2y ago

From my experience there's definitely context beyond the current set of LLM state. It's how they're able to regurgitate facts or speak at all.

1 more reply

darkerside2y ago

> Without persistence outside of the context window, they can't even maintain a dynamic, stable higher level goal.

I mean, can't you say the same for people? We are easily confused and manipulated, for the most part.

1 more reply

mrangle2y ago

I agree with your premise.

You're right: I haven't seen evidence of LLM novel pattern output that is basically creative.

It can find and remix patterns where there are pre-existing rules and maps that detail where they are and how to use them (ie: grammar, phonics, or an index). But it can't, whatsoever, expose new patterns. At least public facing LLM's can't. They can't abstract.

I think that this is an important distinction when speaking of AI pattern finding, as the language tends to imply AGI behavior.

But abstraction (as perhaps the actual marker of AGI) is so different from what they can do now that it essentially seems to be futurism whose footpath hasn't yet been found let alone traversed.

When they can find novel patterns across prior seemingly unconnected concepts, then they will be onto something. When "AI" begins to see the hidden mirrors so to speak.

FeepingCreature2y ago

If LLMs can copy the symbolic behaviors that let humans generate new knowledge, it'll be there.

satvikpendem2y ago

> , they just regurgitate it, remix it, and expose patterns

Who cares? Sometimes the remixation of such patterns is what leads to new insights in us humans. It is dumb to think that remixing has no material benefit, especially when it clearly does.

bitcharmer2y ago

> They are very convincing, and show that the Turing test may be flawed

The only think flawed here is this statement. Are you even familiar with the premise of Turing test?

smilekzs2y ago

Maybe "rightfully so" meant "it is totally within Sam's right to claim that LLMs aren't sufficient for AGI"?

j / k navigate · click thread line to collapse

0 comments

jumploops2y ago

Ilya may be right that a super-scaled transformer model (with additional mechanics beyond today's LLMs) will achieve AGI, or he may be wrong.

Therefore something more than an LLM is needed to reach AGI, what that is, we don't yet know!

dboreham2y ago

mlyle2y ago

Clearly there's a difference, because the architectures we have don't know how to persist information or further train.

Without persistence outside of the context window, they can't even maintain a dynamic, stable higher level goal.

Whether you can bolt something small to these architectures for persistence and do some small things and get AGI is an open question, but what we have is clearly insufficient by design.

I expect it's something in-between: our current approaches are a fertile ground for improving towards AGI, but it's also not a trivial further step to get there.

visarga2y ago

Just consider the "Textbook quality data" paper Phi-1.5 and Orca datasets, they show that diverse chain of thought synthetic data is 5x better than organic text.

1 more reply

satvikpendem2y ago

Paul-Craft2y ago

I don't claim that RAG + LLM = AGI, but I do think it takes you a long way toward goal-oriented, autonomous agents with at least a degree of intelligence.

razodactyl2y ago

From my experience there's definitely context beyond the current set of LLM state. It's how they're able to regurgitate facts or speak at all.

1 more reply

darkerside2y ago

> Without persistence outside of the context window, they can't even maintain a dynamic, stable higher level goal.

I mean, can't you say the same for people? We are easily confused and manipulated, for the most part.

1 more reply

mrangle2y ago

I agree with your premise.

You're right: I haven't seen evidence of LLM novel pattern output that is basically creative.

I think that this is an important distinction when speaking of AI pattern finding, as the language tends to imply AGI behavior.

But abstraction (as perhaps the actual marker of AGI) is so different from what they can do now that it essentially seems to be futurism whose footpath hasn't yet been found let alone traversed.

When they can find novel patterns across prior seemingly unconnected concepts, then they will be onto something. When "AI" begins to see the hidden mirrors so to speak.

FeepingCreature2y ago

If LLMs can copy the symbolic behaviors that let humans generate new knowledge, it'll be there.

satvikpendem2y ago

> , they just regurgitate it, remix it, and expose patterns

Who cares? Sometimes the remixation of such patterns is what leads to new insights in us humans. It is dumb to think that remixing has no material benefit, especially when it clearly does.

bitcharmer2y ago

> They are very convincing, and show that the Turing test may be flawed

The only think flawed here is this statement. Are you even familiar with the premise of Turing test?

smilekzs2y ago

Maybe "rightfully so" meant "it is totally within Sam's right to claim that LLMs aren't sufficient for AGI"?

j / k navigate · click thread line to collapse