But there is also a more subtle thing, which is that we're trending towards superintelligence with these AIs. At that point, Bob may discover that anything agents can't do, Alice can't do either, because she is limited by trying to think using soggy meat as opposed to a high-performance engineered thinking system. Not going to win that battle in the long term.
> The market will always value the exact things LLMs can not do, because if an LLM can do something, there is no reason to hire a person for that.
The market values bulldozers. Whether a human does actual work or not isn't particularly exciting to a market.
The article addresses this, because, well... no we aren't. Maybe we are. But it's far from clear that we're not moving toward a plateau in what these agents can do.
> Whether a human does actual work or not isn't particularly exciting to a market.
You seem to be convinced these AI agents will continue to improve without bound, so I think this is where the disconnect lies. Some of us (including the article author) are more skeptical. The market values work actually getting done. If the AIs have limits, and the humans driving them no longer have the capability to surpass those limits on their own, then people who have learned the hard way, without relying so much on an AI, will have an advantage in the market.
I already find myself getting lazy as a software developer, having an LLM verify my work, rather than going through the process of really thinking it through myself. I can feel that part of my skills atrophying. Now consider someone who has never developed those skills in the first place, because the LLM has done it for them. What happens when the LLM does a bad job of it? They'll have no idea. I still do, at least.
Maybe someday the AIs will be so capable that it won't matter. They'll be smarter and more thorough and be able to do more, and do it correctly, than even the most experienced person in the field. But I don't think that's even close to a certainty.
It is a debatable topic, and I agree with you that it's unclear whether we will hit the wall at some point. But one thing I want to mention is that back when AI agents had only just been conceived and the most popular type of """AI""" was the LLM-based chatbot, it also seemed that we were approaching some kind of plateau in their performance. Then "agents" appeared, and this plateau, the wall we're likely to hit at some point, was pushed further out. I don't know (who does?) how far we can push the boundaries, or what comes next. Who knows, for example, when a completely new architecture different from Transformers will come out and be adopted everywhere, allowing for something new? The future is uncertain. We may hit the wall this year, or we may not hit it in the next 10-20 years. It is, indeed, unclear.
P.S. I am well aware of all of the risks that agents brought. I'm speaking in terms of pure "maximum performance", so to speak.
It doesn't matter if Bob can be normal. There was no point to him being paid to be on the program.
From the article:
> If you hand that process to a machine, you haven't accelerated science. You've removed the only part of it that anyone actually needed.
Yeah, I'm surprised at the number of people who read the article and came away with the conclusion that the program was designed to churn deliverables, and then they conclude that it doesn't matter if Bob can only function with an AI holding his hand, because he can still deliver.
That isn't the output of the program; the output is an Alice. That's the point of the program. They don't want the results generated by Alice, they want the final Alice.
I wouldn't count on that because even if it happens, we don't know when it will happen, and it's one of those things where how close it looks is no indication of how close it actually is. We could just as easily spend the next 100 years being 10 years away from AGI. Just look at fusion power, self-driving cars, etc.
Whatever models suck at, we can pour money into making them do better. It's very cut and dry. The squirrely bit is how that contributes to "general intelligence" and whether the models are progressing towards overall autonomy due to our changes. That mostly matters for the AGI mouthbreathers though, people doing actual work just care that the models have improved.
Do you have any evidence for that, though? Besides marketing claims, I mean.
If I dictated this comment instead of typing it, maybe 2 to 5 words would come out wrong. For a human it is maybe 10% of that.
Do you have a solution for me? How does the market value things that don't yet exist in this brave new world?
> There's a common rebuttal to this, and I hear it constantly. "Just wait," people say. "In a few months, in a year, the models will be better. They won't hallucinate. They won't fake plots. The problems you're describing are temporary." I've been hearing "just wait" since 2023.
We're not trending towards superintelligence with these AIs. We're trending towards (and, in fact, have already reached) superintelligence with computers in general, but LLM agents are among the least capable known algorithms for the majority of tasks we get them to do. The problem, as it usually is, is that most people don't have access to the fruits of obscure research projects.
Untrained children write better code than the most sophisticated LLMs, without even noticing they're doing anything special.
I’ll take that bet. How much money would you like to put on this, and we’ll have a neutral third party pick both the untrained child and the LLM.
Let me know.
It is (rightly) difficult to get hold of one uninvolved child, for safeguarding reasons, so it would be better to run it as a school (or interschool) competition, where multiple children may participate. For fairness, you may also provide multiple LLM participants (however you define that). The winner of the contest, as determined by the judge, would then determine the winner of the bet – unless the winning child had been trained, in which case we would fall back to the next-highest-ranked participant. The number of LLM candidates would be equal to the number of eligible children.
However, I don't see a good way to allow each child to pick a programming language and task, without leaving the competition results incomparable. So perhaps each child should be paired with an LLM, and the judge should determine which submission from each pair is better? But then if I only need one victory (to support my claim), this is clearly unfair. So each pair should be tested enough to determine whether they're consistently better than the LLM… but then we are demanding a lot of the child participants, for no real benefit to them.
If we can agree on a workable protocol, I can try to pull some strings and see if we can make this happen. I could use the money.
The recent article where the AI companies are paying experts in the field to help train the models makes me wonder if they're also manually fixing a bunch of post-processing errors as they come up.
I don't care how many terms you add to your Taylor series: your polynomial approximation of a sine wave is never going to be suitable for additive speech synthesis. Likewise, I don't care how good your predictive-text transformer model gets at instrumental NLP subtasks: it will never be a good programmer (except as far as it's a plagiarist). Just look at the Claude Code source code: if anyone's an expert in agentic AI development, it's the Claude people, and yet the codebase is utterly unmaintainable dogshit that shouldn't work and, on further inspection, doesn't work.
That's not to say that no computer program can write computer programs, but this computer program is well into the realm of diminishing returns.
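On the Taylor series analogy above, a minimal sketch of why adding terms only helps near the expansion point. It assumes the polynomial is the Maclaurin series of sin evaluated directly, with no range reduction; `taylor_sin` and `max_error` are hypothetical helper names made up for this illustration.

```python
import math

def taylor_sin(x, terms):
    """Maclaurin polynomial of sin(x) using `terms` nonzero terms."""
    total = 0.0
    term = x  # first term: x^1 / 1!
    for k in range(terms):
        total += term
        # Next term: multiply by -x^2 / ((2k+2)(2k+3)).
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total

def max_error(terms, x_max, samples=1000):
    """Largest |polynomial - sin| on an even grid over [0, x_max]."""
    worst = 0.0
    for i in range(samples + 1):
        x = x_max * i / samples
        worst = max(worst, abs(taylor_sin(x, terms) - math.sin(x)))
    return worst

# Compare half a period against ten full periods for a few term counts.
for terms in (5, 10, 20):
    near = max_error(terms, math.pi)
    far = max_error(terms, 20 * math.pi)
    print(f"{terms:2d} terms: error on [0, pi] = {near:.3e}, "
          f"on [0, 20*pi] = {far:.3e}")
```

For scale, one second of a 440 Hz tone spans about 2π · 440 ≈ 2765 radians, far beyond the window in which any fixed number of terms stays accurate.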
I have literally never run into this in my career. Challenges have always been something to help me grow.
I would take that bet on the side of the wet meat. In the future, every AI will be an ad executive. At least the meat programming won't be preloaded to sell ads every N tokens.
If Bob is going to spend $500 in tokens on something I can do for $50, I don't think Bob is going to last long in the lawn-mowing market driving a bulldozer.