0 points · themafia · 3d ago · 0 comments

> Human attention is truly ephemeral, with a ridiculously short span.

I do not believe this at all. I think you'd have to have a very limited experience working with other human beings to be able to believe this.

> and by training it better.

"Oh yea, just do it _better_. That's your problem." Perhaps some people operate without any context but most of us find the experience lacking.

2 comments · 1 top-level

orbital-decay · 2d ago · 1 in thread

That's a simple neurological fact; there's nothing to believe. You can use a trivial experiment to verify that you can't keep details in your sliding attention window for more than a few seconds or focus on more than a few things simultaneously. Human memory and cognition are layered, and this immediate layer is what most resembles the model's context.

You're possibly mixing it up with long-term memory, which doesn't keep immediate facts and details; it holds heavily processed and compressed summaries, for lack of a better analogy from the LLM world. You aren't keeping the entire codebase in your memory, just its highly processed and conceptualized version. This conceptualization can be somewhat emulated as an agentic loop, but it can only go so far: current models quickly lose coherence and aren't good enough at predicting what's important.

Models don't need to remember more details, they need stronger processing of what they already remember.

>"Oh yea, just do it _better_. That's your problem."

I think we're talking about different things. Models can be trained better to cram more intelligence into the same amount of parameters, that's what I mean. Similarly to how your ability to learn (and perform) math depends on your prior math training.

themafia (OP) · 2d ago

> You can use a trivial experiment to verify that you can't keep details in your sliding attention window for more than a few seconds or focus on more than a few things simultaneously.

I said "context window" not "attention window." Of course spans of attention are limited. Knowledge is not. Knowledge is often highly specific.

> You're possibly mixing it up

Not really. You've simply failed to verify your understanding of my argument and instead created something of a strawman.

> You aren't keeping the entire codebase in your memory, just its highly processed and conceptualized version

And what is your basis for this claim? Why a codebase? You don't think I can remember an entire function? Yet actors can remember entire sets of lines for a scene? Is that just a highly processed and abstracted version in their minds? And they just run some cognitive loop to recreate dialog in real time?

> they need stronger processing of what they already remember.

What is "stronger?" More time? More memory? More compute? And how is that put to use? Why is it that, when given certain prompts, LLMs reproduce hundreds of pages of directly copied, copyrighted work?

> Models can be trained better to cram more intelligence into the same amount of parameters, that's what I mean.

Cool. _How_? What is the limit of this training? How efficient is it? How many resources do you need on the input for a given increase in output? Otherwise it's just a ton of hand waving going on here.
