roenxi 1mo ago

That doesn't sound like much of an issue. Bob was already going to encounter problems that are too large and complex for him to solve, agents or otherwise. Life throws us hard problems. I don't recall if we even assumed Bob was unusually capable, he might be one of life's flounderers. I'd give good odds that if he got through a program with the help of agents he'll get through life achieving at least a normal level of success.

But there is also a more subtle thing, which is that we're trending towards superintelligence with these AIs. At that point, Bob may discover that anything agents can't do, Alice can't do either, because she is limited by trying to think using soggy meat as opposed to a high-performance engineered thinking system. Not going to win that battle in the long term.

> The market will always value the exact things LLMs can not do, because if an LLM can do something, there is no reason to hire a person for that.

The market values bulldozers. Whether a human does actual work or not isn't particularly exciting to a market.

kelnos 1mo ago

> we're trending towards superintelligence with these AIs

The article addresses this, because, well... no we aren't. Maybe we are. But it's far from clear that we're not moving toward a plateau in what these agents can do.

> Whether a human does actual work or not isn't particularly exciting to a market.

You seem to be convinced these AI agents will continue to improve without bound, so I think this is where the disconnect lies. Some of us (including the article author) are more skeptical. The market values work actually getting done. If the AIs have limits, and the humans driving them no longer have the capability to surpass those limits on their own, then people who have learned the hard way, without relying so much on an AI, will have an advantage in the market.

I already find myself getting lazy as a software developer, having an LLM verify my work, rather than going through the process of really thinking it through myself. I can feel that part of my skills atrophying. Now consider someone who has never developed those skills in the first place, because the LLM has done it for them. What happens when the LLM does a bad job of it? They'll have no idea. I still do, at least.

Maybe someday the AIs will be so capable that it won't matter. They'll be smarter and more thorough, and be able to do more, and do it correctly, than even the most experienced person in the field. But I don't think that's even close to a certainty.

zozbot234 1mo ago

There's no good definition of superintelligence. A calculator is already way more capable than any human at doing simple mathematical operations, and even small AIs for local use can instantly recall all sorts of impressive knowledge about virtually any field of study, which would be unfeasible for any human; but neither of those is what people mean when they wonder whether future AIs will have superintelligence.

Jensson 1mo ago

General superintelligence is better defined, and I assume that is what he meant. When I hear superintelligence, I assume they just mean general superintelligence, as in it's better than humans at every single mental task that exists.

dryarzeg 1mo ago

> But it's far from clear that we're not moving toward a plateau in what these agents can do.

It is a debatable topic, and I agree with you that it's unclear whether we will hit the wall at some point. But one thing I want to mention is that back when AI agents were only just being conceived and the most popular type of "AI" was the LLM-based chatbot, it also seemed that we were approaching some kind of plateau in their performance. Then "agents" appeared, and that plateau, the wall we're likely to hit at some point, was pushed further out. I don't know (who does?) how far we can push the boundaries, but who knows what comes next? Who knows, for example, when a completely new architecture different from Transformers will come out and be adopted everywhere, allowing for something new? The future is uncertain. We may hit the wall this year, or we may not hit it in the next 10-20 years. It is, indeed, unclear.

bee_rider 1mo ago

Are agents something special? We already had LLMs that could call tools. Agents are just that, in a loop, right?
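For concreteness, "tool calls in a loop" can be sketched roughly like this. It is only a toy under loud assumptions: `fake_model` is a hard-coded stand-in for an LLM, and `TOOLS`, `add`, and `run_agent` are hypothetical names made up for illustration, not any real agent framework's API.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an LLM. It reads the transcript and returns either
# ("call", tool_name, argument) or ("final", answer, "").
def fake_model(transcript: List[str]) -> Tuple[str, str, str]:
    if not any(line.startswith("tool:") for line in transcript):
        # No tool result yet, so ask for one.
        return ("call", "add", "2,3")
    result = transcript[-1].split(":", 1)[1]
    return ("final", f"The sum is {result}", "")

def add(arg: str) -> str:
    """Toy tool: add two comma-separated integers."""
    a, b = (int(x) for x in arg.split(","))
    return str(a + b)

# Hypothetical tool registry.
TOOLS: Dict[str, Callable[[str], str]] = {"add": add}

def run_agent(task: str,
              model: Callable[[List[str]], Tuple[str, str, str]] = fake_model,
              max_steps: int = 5) -> str:
    """Call the model, run whatever tool it asks for, feed the result back, repeat."""
    transcript = [f"user:{task}"]
    for _ in range(max_steps):
        kind, name, arg = model(transcript)
        if kind == "final":
            return name  # second field carries the answer for "final"
        transcript.append(f"tool:{TOOLS[name](arg)}")
    return "gave up"

print(run_agent("what is 2+3?"))
```

The only thing the loop adds over a single tool call is that the tool's output goes back into the transcript before the model is asked again.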

darkstarsys 1mo ago

Yes, and this is an incredibly powerful idea. Running fluid flow simulations inside an optimization loop (Monte Carlo + gradient descent) revolutionized aircraft design, nuclear simulations and geophysics. When the tool being called updates the LLM's training data, or runs experiments that the LLM can learn from, that potentially becomes a self-improvement loop.

dryarzeg 1mo ago

Roughly speaking - yes. Still, it's an advancement - even if it's a small one - on the usual chatbots, right?

P.S. I am well aware of all of the risks that agents bring. I'm speaking in terms of pure "maximum performance", so to speak.

mattmanser 1mo ago

The author's point went a little over your head.

It doesn't matter if Bob can be normal. There was no point to him being paid to be on the program.

From the article:

> If you hand that process to a machine, you haven't accelerated science. You've removed the only part of it that anyone actually needed.

lelanthran 1mo ago

> It doesn't matter if Bob can be normal. There was no point to him being paid to be on the program.

Yeah, I'm surprised at the number of people who read the article and came away with the conclusion that the program was designed to churn deliverables, and then they conclude that it doesn't matter if Bob can only function with an AI holding his hand, because he can still deliver.

That isn't the output of the program; the output is an Alice. That's the point of the program. They don't want the results generated by Alice, they want the final Alice.

alex_suzuki 1mo ago

It’s a fairly long article, maybe they had it summarized and came to that conclusion…

SoftTalker 1mo ago

And then you realize that most of science is unnecessary. As TFA points out, it doesn't matter if the age of the universe is 13.77 or 13.79 billion years. So you ban AI in science, you produce more scientists who can solve problems that don't matter. So what?

greedo 1mo ago

Funny thing about that. Knowing what science is important is the hard part, and you often don't know for years or even decades.

dandellion 1mo ago

> we're trending towards superintelligence with these AIs

I wouldn't count on that, because even if it happens, we don't know when it will happen, and it's one of those things where how close it looks is no indication of how close it actually is. We could just as easily spend the next 100 years being 10 years away from AGI. Just look at fusion power, self-driving cars, etc.

CuriouslyC 1mo ago

Fusion isn't a good example. Self-driving cars are a battle between regulation and 9's of reliability; if we were willing to accept self-driving cars that crashed as much as humans do, they'd be here already.

Whatever models suck at, we can pour money into making them do better. It's very cut and dried. The squirrelly bit is how that contributes to "general intelligence" and whether the models are progressing towards overall autonomy due to our changes. That mostly matters for the AGI mouthbreathers though; people doing actual work just care that the models have improved.

b00ty4breakfast 1mo ago

> But there is also a more subtle thing, which is we're trending towards superintelligence with these AIs

Do you have any evidence for that, though? Besides marketing claims, I mean.

roenxi (OP) 1mo ago

I've always quite liked https://ourworldindata.org/grapher/test-scores-ai-capabiliti... to show that once AIs are knocking at the door of a human capability they tend to overshoot in around a decade.

Lionga 1mo ago

This is just trash, like almost any AI benchmark. E.g. it says speech recognition has been above human level since around 2015, yet any speech input today has more errors than any human would make.

If I spoke this comment instead of typing it, maybe 2 to 5 words would be wrong. For a human it is maybe 10% of that.

b00ty4breakfast 1mo ago

We have to look at what LLMs are and are not doing for this to be applicable; they are not "thinking", there is no real cognition going on inside an LLM. They are making statistical connections between data points in their training sets. Obviously, that has borne some pretty interesting (and sometimes even useful) results, but they are not doing anything that any reasonably informed person would call "intelligent" and certainly not "super intelligent".

uoaei 1mo ago

"Things that have never been done before in software" has been my entire career. A lot of it requires specific knowledge of physics, modelling, computer science, and the tradeoffs involved in parsimony and efficiency vs accuracy and fidelity.

Do you have a solution for me? How does the market value things that don't yet exist in this brave new world?

wizzwizz4 1mo ago

From the article:

> There's a common rebuttal to this, and I hear it constantly. "Just wait," people say. "In a few months, in a year, the models will be better. They won't hallucinate. They won't fake plots. The problems you're describing are temporary." I've been hearing "just wait" since 2023.

We're not trending towards superintelligence with these AIs. We're trending towards (and, in fact, have already reached) superintelligence with computers in general, but LLM agents are among the least capable known algorithms for the majority of tasks we get them to do. The problem, as it usually is, is that most people don't have access to the fruits of obscure research projects.

Untrained children write better code than the most sophisticated LLMs, without even noticing they're doing anything special.

blackqueeriroh 1mo ago

> Untrained children write better code than the most sophisticated LLMs, without even noticing they're doing anything special.

I’ll take that bet. How much money would you like to put on this, and we’ll have a neutral third party pick both the untrained child and the LLM.

Let me know.

wizzwizz4 1mo ago

I'm willing to bet 10% of my net worth on this. But my claim was not about any given untrained child (for instance, a child who does not want to program would do poorly): a fair bet would allow me to choose the child, you to choose the LLM, use a task and programming language of the child's choice, and have a neutral third-party familiar with the programming language judge "better code". (I would, of course, want to ensure that the judge used an appropriate rubric: RLHF can produce a sophisticated turd-polisher. Perhaps the evaluation process could involve modifications made to the program?)

It is (rightly) difficult to get hold of one uninvolved child, for safeguarding reasons, so it would be better to run it as a school (or interschool) competition, where multiple children may participate. For fairness, you may also provide multiple LLM participants (however you define that). The winner of the contest, as determined by the judge, would then determine the winner of the bet – unless the winning child had been trained, in which case we would fall back to the next-highest-ranked participant. The number of LLM candidates would be equal to the number of eligible children.

However, I don't see a good way to allow each child to pick a programming language and task, without leaving the competition results incomparable. So perhaps each child should be paired with an LLM, and the judge should determine which submission from each pair is better? But then if I only need one victory (to support my claim), this is clearly unfair. So each pair should be tested enough to determine whether they're consistently better than the LLM… but then we are demanding a lot of the child participants, for no real benefit to them.

If we can agree on a workable protocol, I can try to pull some strings and see if we can make this happen. I could use the money.

jnovek 1mo ago

The rate of hallucination has gone down drastically since 2023. As LLM coding tools continue to pare that rate down, eventually we’ll hit a point where it is comparable to the rate at which we naturally introduce bugs as human programmers.

suzzer99 1mo ago

I wonder how much of the decrease in hallucination is because the models are getting better, and how much is because these massively over-funded companies are adding a bunch of one-off shims at breakneck speed. I.e., are they truly improving the cognition, or just monkey-patching the hell out of it?

The recent article where the AI companies are paying experts in the field to help train the models makes me wonder if they're also manually fixing a bunch of post-processing errors as they come up.

wizzwizz4 1mo ago

LLMs are still making fundamentally the same kinds of errors that they made in 2021. If you check my HN comment history, you'll see I predicted these errors, just from skimming the relevant academic papers (which is to say they're obvious: I'm far from the only person saying this). There is no theoretical reason we should expect them to go away, unless the model architectures fundamentally change (and no, GPT -> LLaMA is not a fundamental change), because they're not removable discontinuities: they're indicative of fundamental capability gaps.

I don't care how many terms you add to your Taylor series: your polynomial approximation of a sine wave is never going to be suitable for additive speech synthesis. Likewise, I don't care how good your predictive-text transformer model gets at instrumental NLP subtasks: it will never be a good programmer (except as far as it's a plagiarist). Just look at the Claude Code source code: if anyone's an expert in agentic AI development, it's the Claude people, and yet the codebase is utterly unmaintainable dogshit that shouldn't work and, on further inspection, doesn't work.
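To make the Taylor series point concrete: any fixed number of terms gives you a polynomial, and a nonconstant polynomial is unbounded while sine stays in [-1, 1], so the approximation is fine near zero and useless many periods out. A quick sketch (the name `taylor_sin` and the sample points are purely illustrative choices, not from any library):

```python
import math

def taylor_sin(x: float, terms: int) -> float:
    """Sum the first `terms` terms of the Maclaurin series for sin(x)."""
    total = 0.0
    for n in range(terms):
        total += (-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
    return total

# Compare the truncated series with math.sin close to zero and many periods away.
for x in (0.5, math.pi, 20 * math.pi):
    approx = taylor_sin(x, terms=5)
    print(f"x={x:8.3f}  sin={math.sin(x):+.6f}  taylor={approx:+.6e}")
```

Adding terms widens the region where the fit is good, but for any fixed count there is always an `x` past which the polynomial runs away from the wave it is meant to track.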

That's not to say that no computer program can write computer programs, but this computer program is well into the realm of diminishing returns.

whateveracct 1mo ago

> That doesn't sound like much of an issue. Bob was already going to encounter problems that are too large and complex for him to solve, agents or otherwise.

I have literally never run into this in my career. Challenges have always been something to help me grow.

ModernMech 1mo ago

> Not going to win that battle in the long term.

I would take that bet on the side of the wet meat. In the future, every AI will be an ad executive. At least the meat programming won't be preloaded to sell ads every N tokens.

ozim 1mo ago

The market values bulldozers for bulldozing jobs. No one is going to use a bulldozer to mow a lawn.

If Bob is going to spend $500 in tokens for something I can do for $50, I think Bob is not going to stay long in the lawn-mowing market driving a bulldozer.
