
0 points · fragmede · 10mo ago · 0 comments

A system that self-updates its weights is so obvious the only question is who will be the first to get there?


11 comments · 7 top-level

danenania · 10mo ago · 4 in thread

I’m not sure that self-updating weights is really analogous to “continuous learning” as humans do it. A memory data structure that the model can search efficiently might be a lot closer.

Self-updating weights could be more like epigenetics.
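A minimal sketch of the "searchable memory" idea above, assuming a toy word-overlap score in place of a real embedding index. The `MemoryStore` class and its methods are hypothetical names chosen for illustration; the point is only that the store grows while no model weights change.

```python
from collections import Counter


class MemoryStore:
    """Toy external memory: stores notes and retrieves them by word overlap."""

    def __init__(self):
        self.notes = []

    def add(self, text):
        self.notes.append(text)

    def search(self, query, k=1):
        # Score each note by the number of words it shares with the query.
        q = Counter(query.lower().split())
        scored = []
        for note in self.notes:
            overlap = sum((q & Counter(note.lower().split())).values())
            scored.append((overlap, note))
        # Highest overlap first; drop notes with no shared words.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [note for score, note in scored[:k] if score > 0]


memory = MemoryStore()
memory.add("patient reported chest pain after exercise")
memory.add("build failed because of a missing header file")
top = memory.search("why did the build fail", k=1)
```

Here `top` holds the build-failure note, since it is the only one sharing a word with the query. A real system would likely swap the overlap score for vector similarity, but the retrieval loop would look much the same.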

Jensson · 10mo ago

Human neurons are self-updating, though. We aren't running on our genes; each cell uses our genes to determine how to connect to other cells, and then the cell learns how to process some information based on what it hears from its connected cells.

So, genes would be a meta-model that then updates weights in the real model so it can learn how to process new kinds of things, and for stuff like facts you can use an external memory, just like humans do.

Without updating the weights in the model you will never be able to learn to process new things, like a new kind of math, since you learn that not by memorizing facts but by building new models for it.

HarHarVeryFunny · 10mo ago

There's a difference between memory and learning.

Would you rather your illness was diagnosed by a doctor or by a plumber with access to a stack of medical books?

Learning is about assimilating lots of different sources of information, reconciling the differences, trying things out for yourself, learning from your mistakes, being curious about your knowledge gaps and contradictions, and ultimately learning to correctly predict outcomes/actions based on everything you have learnt.

You will soon see the difference in action, as Anthropic apparently agree with you that memory can replace learning, and are going to be relying on LLMs with longer compressed context (i.e. memory) in place of the ability to learn. I guess this'll be Anthropic's promised 2027 "drop-in replacement remote worker": not an actual plumber unfortunately (no AGI), but an LLM with a stack of your company's onboarding material. It'll have perfect (well, "compressed") recall of everything you've tried to teach it, or complained about, but will have learnt nothing from that.

danenania · 10mo ago

I think my point is that when the doctor diagnoses you, she often doesn’t do so immediately. She is spending time thinking it through, and as part of that process is retrieving various pieces of relevant information from her memory (both long term and short term).

I think this may be closer to an agentic, iterative search (à la Claude Code) than direct inference using continuously updated weights. If it were the latter, there would be no process of thinking it through or trying to recall relevant details, past cases, papers she read years ago, and so on; the diagnosis would just pop out instantaneously.


imtringued · 10mo ago

In spiking neural networks, the model weights are equivalent to dendrites/synapses, which can form anew and decay during your lifetime.

soulofmischief · 10mo ago

It's not always as useful as you think from the perspective of a business trying to sell an automated service to users who expect reliability. Now you have to worry about waking up in the middle of the night to rewind your model to a last known good state, leading to real data loss as far as users are concerned.

Data and functionality become entwined and basically you have to keep these systems on tight rails so that you can reason about their efficacy and performance, because any surgery on functionality might affect learned data, or worse, even damage a memory.
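A rough sketch of the rollback problem described above, assuming a toy model whose "weights" are just a dictionary. `CheckpointedModel` and its methods are hypothetical; the sketch only shows that restoring a last known good snapshot also discards whatever legitimate learning happened after it.

```python
import copy


class CheckpointedModel:
    """Toy model whose 'weights' are a dict, with snapshot and rollback."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self.checkpoints = []  # list of (label, weights snapshot)

    def snapshot(self, label):
        self.checkpoints.append((label, copy.deepcopy(self.weights)))

    def learn(self, key, value):
        # Stand-in for an online weight update.
        self.weights[key] = value

    def rollback(self):
        # Restore the most recent snapshot; anything learned since is lost.
        label, saved = self.checkpoints[-1]
        lost = {k: v for k, v in self.weights.items() if saved.get(k) != v}
        self.weights = copy.deepcopy(saved)
        return label, lost


model = CheckpointedModel({"greeting": "hello"})
model.snapshot("last-known-good")
model.learn("user_preference", "dark mode")  # legitimate user data
model.learn("greeting", "h3ll0")             # a bad update
label, lost = model.rollback()
```

After the rollback, `lost` contains both the bad update and the user's preference: the fix for one is data loss for the other.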

It's going to take a long time to solve these problems.

HarHarVeryFunny · 10mo ago

Sure, it's obvious, but it's only one of the missing pieces required for brain-like AGI, and really upends the whole LLM-as-AI way of doing things.

Runtime incremental learning is still going to be based on prediction failure, but now it's no longer failure to predict the training set, but rather requires closing the loop and having (multi-modal) runtime "sensory" feedback - what were the real-world results of the action the AGI just predicted (generated)? This is no longer an auto-regressive model where you can just generate (act) by feeding the model's own output back in as input, but instead you now need to continually gather external feedback to feed back into your new incremental learning algorithm.
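As a minimal sketch of that closed loop, assume a one-weight predictor and a hypothetical `environment` function standing in for real-world feedback. Both are illustrative inventions, not a proposal for the actual learning algorithm; the update is a plain delta rule driven by prediction error.

```python
def environment(action_input):
    # Hypothetical "world": the true outcome the model is trying to predict.
    return 3.0 * action_input


class OnlinePredictor:
    def __init__(self, lr=0.1):
        self.w = 0.0
        self.lr = lr

    def predict(self, x):
        return self.w * x

    def update(self, x, observed):
        # Learn from prediction failure: move w to shrink (observed - predicted).
        error = observed - self.predict(x)
        self.w += self.lr * error * x
        return error


model = OnlinePredictor(lr=0.1)
errors = []
for step in range(50):
    x = 1.0 + (step % 3)           # inputs 1, 2, 3 repeating
    prediction = model.predict(x)  # act
    outcome = environment(x)       # gather feedback from outside the model
    errors.append(abs(model.update(x, outcome)))
```

The model never sees its own output as input; each step depends on an outcome that has to be collected from outside, which is the part the prompt-response setup lacks.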

For a multi-modal model the feedback would have to include image/video/audio data as well as text, but even if initial implementations of incremental learning systems restricted themselves to text, it still turns the whole LLM-based way of interacting with the model on its head: the model generates text-based actions to throw out into the world, and you now need to gather the text-based future feedback to those actions. With chat the feedback is more immediate, but with something like software development it is far more nebulous: the model makes a code edit, and the feedback only comes later when compiling, running, debugging, etc., or maybe when trying to refactor or extend the architecture in the future. In corporate use the response to an AGI-generated e-mail or message might come in many delayed forms, with these then needing to be anticipated, captured, and fed back into the model.

Once you've replaced the simple LLM prompt-response mode of interaction with one based on continual real-world feedback, and designed the new incremental (Bayesian?) learning algorithm to replace SGD, maybe the next question is what model is being updated, and where does this happen? It's not at all clear that the idea of a single shared (between all users) model will work when you have millions of model instances all simultaneously doing different things and receiving different feedback on different timescales... Maybe the incremental learning now needs to be applied to a user-specific model instance (perhaps with some attempt to later share & re-distribute whatever it has learnt), even if that is still cloud based.
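To make the "(Bayesian?)" and per-user points concrete, here is a small sketch using a textbook Beta-Bernoulli update, with one belief per user instead of a single shared one. The `BetaBelief` class, `record_feedback` helper, and user ids are hypothetical, and this is only one simple form an incremental Bayesian update could take.

```python
class BetaBelief:
    """Beta(alpha, beta) belief about the probability that an action succeeds."""

    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha
        self.beta = beta

    def update(self, success):
        # Conjugate update: one observation adds one pseudo-count.
        if success:
            self.alpha += 1
        else:
            self.beta += 1

    def mean(self):
        return self.alpha / (self.alpha + self.beta)


# One belief per user instance, rather than a single shared model.
beliefs = {}


def record_feedback(user_id, success):
    belief = beliefs.setdefault(user_id, BetaBelief())
    belief.update(success)
    return belief.mean()


for outcome in [True, True, False, True]:
    record_feedback("alice", outcome)
record_feedback("bob", False)
```

Each user's feedback arrives on its own schedule and only touches that user's belief. Merging what the instances have learnt back into something shared is the open question the comment raises, and this sketch does not attempt it.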

So... a lot of very fundamental changes need to be made, just to support self-learning and self-updates, and we haven't even discussed all the other equally obvious differences between LLMs and a full cognitive architecture that would be needed to support more human-like AGI.

imtringued · 10mo ago

I wonder when there will be proofs in theoretical computer science that an algorithm is AGI-complete, the same way there are proofs of NP-completeness.

Conjecture: A system that self-updates its weights according to a series of objective functions, but does not suffer from catastrophic forgetting (performance only degrades due to capacity limits, rather than from switching tasks), is AGI-complete.

Why? Because it could learn literally anything!
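For readers unfamiliar with catastrophic forgetting, a toy illustration, assuming a two-weight linear model trained with plain gradient descent on one task and then another. The tasks, the task flag `t`, and all function names are made up for the example. The model has enough capacity to fit both tasks at once, yet sequential training still overwrites the first.

```python
def predict(w, x, t):
    # w[0] is shared across tasks; w[1] only acts when the task flag t is 1.
    return w[0] * x + w[1] * x * t


def train(w, data, lr=0.1, epochs=200):
    w = list(w)
    for _ in range(epochs):
        for x, t, y in data:
            err = y - predict(w, x, t)
            w[0] += lr * err * x
            w[1] += lr * err * x * t
    return w


def mse(w, data):
    return sum((y - predict(w, x, t)) ** 2 for x, t, y in data) / len(data)


task_a = [(1.0, 0, 2.0), (2.0, 0, 4.0)]     # task flag 0: y = 2x
task_b = [(1.0, 1, -1.0), (2.0, 1, -2.0)]   # task flag 1: y = -x

w = train([0.0, 0.0], task_a)
error_a_before = mse(w, task_a)
w = train(w, task_b)                         # sequential, no replay of task A
error_a_after = mse(w, task_a)
joint = mse([2.0, -3.0], task_a + task_b)    # weights that fit both tasks exist
```

The error on task A is near zero before training on task B and clearly nonzero afterwards, even though `joint` shows a weight setting that solves both. That gap between "capacity exists" and "the update rule finds it" is what the conjecture assumes away.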

tmountain · 10mo ago

I’m no expert, but it seems like self-updating weights require a grounded understanding of the underlying subject matter, and this seems like a problem for current LLM systems.

emporas · 10mo ago

But then it is a specialized intelligence, specialized in altering its weights. Reinforcement Learning doesn't work as well when the goal is not easily defined. It does wonders for games, but anything else?

Someone has to specify the goals: a human operator or another A.I. The second A.I. had better be an A.G.I. itself, otherwise its goals will not be significant enough for us to care.

fuckaj · 10mo ago

True. In the same way as making noises down a telephone line is the obvious way to build a million-dollar business.
