As a user, it feels like the race has never been as close as it is now. Perhaps dumb to extrapolate, but it makes me lean more skeptical about the hard take-off / winner-take-all mental model that has been pushed.
Would be curious to hear the take of a researcher at one of these firms - do you expect the AI offerings across competitors to become more competitive and clustered over the next few years, or less so?
No, but I wouldn't be able to tell you what the player did wrong in general.
By contrast, the shortcomings of today's LLMs seem pretty obvious to me.
But one thing will stay consistent with LLMs for some time to come: they are trained to produce output that looks acceptable, and as a side effect they all tend toward deception. You can iterate on that over and over, but there will always be some point where it fails, and the weight of that failure only increases as the deception gets more convincing.
Some things that seemed safe enough: Hindenburg, Titanic, Deepwater Horizon, Chernobyl, Challenger, Fukushima, Boeing 737 MAX.
one can intentionally use a recent model and a much older one to figure out whether the tests are reliable, and in which domains they are reliable.
one can compute a model's joint probability for a sequence and compare how likely each model finds the same sequence.
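Something like this, a minimal sketch assuming the Hugging Face transformers library, with gpt2 and gpt2-large purely as stand-ins for an "older" and "newer" model:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    def sequence_logprob(model_name: str, text: str) -> float:
        tok = AutoTokenizer.from_pretrained(model_name)
        model = AutoModelForCausalLM.from_pretrained(model_name)
        ids = tok(text, return_tensors="pt").input_ids
        with torch.no_grad():
            # With labels=input_ids the loss is the mean next-token NLL;
            # multiplying by the number of predicted tokens turns it into
            # the joint log-probability of the sequence.
            loss = model(ids, labels=ids).loss
        return -loss.item() * (ids.shape[1] - 1)

    text = "The quick brown fox jumps over the lazy dog."
    for name in ["gpt2", "gpt2-large"]:
        print(name, sequence_logprob(name, text))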
we could ask both to start talking about a subject, but alternatingly, each emitting a token. Then look at how the dumber and smarter models judge the resulting sentence: does the smart one tend to pull up the quality of the resulting text, or does it tend to get dragged down more towards the dumber participant?
given enough such tests to identify the dumb one vs. the smart one, and verifying them on pairs where the ranking is common agreement (at the extreme, word2vec vs. a transformer), we can assess the quality of each test, regardless of domain.
on the assumption that such or similar tests let us pick out the smarter one, i.e. assuming we find plenty of such tests, we can demand model makers publish open weights so that we can publicly verify the performance claims.
Another idea is self-consistency tests: a single forward inference over a context of, say, 2048 tokens effectively predicts the conditional 2-gram, 3-gram, 4-gram, ... probabilities of the input tokens. Each output token distribution is conditioned on the preceding inputs: with 2048 input tokens there are 2048 output positions, where the position-1 output is the predicted token vector (logit vector, really) estimated to follow the position-1 input, the position-2 output is the prediction following the first 2 inputs, and so on; the last vector is the predicted next token following all 2048 inputs, p(t_{i+1} | t_1 = a, t_2 = b, ..., t_i = z).
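To make that concrete, a quick sketch (again assuming Hugging Face transformers, with gpt2 as a placeholder): one forward pass yields a next-token distribution at every position.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    ids = tok("the cat sat on the", return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits  # shape: [1, seq_len, vocab_size]
    probs = logits.softmax(-1)
    for i in range(ids.shape[1]):
        # probs[0, i] is the distribution over the token following ids[0, :i+1]
        next_id = int(probs[0, i].argmax())
        print(repr(tok.decode(ids[0, : i + 1])), "->", repr(tok.decode([next_id])))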
But that is just one way the network can predict the next token. Another approach would be gradient descent via reverse-mode automatic differentiation (RMAD), keeping the model weights fixed and treating only the last, say, 512 input vectors as variable: how well do the last 512 forward-prediction output vectors match the output vectors that gradient descent finds for the best joint probability?
This could also be added as a loss term during training, as a form of regularization, which roughly turns it into a kind of Energy-Based Model.
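A rough sketch of the fixed-weights input-optimization idea, with gpt2 as a placeholder and only 3 positions treated as variable for brevity (512 in the example above):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    for p in model.parameters():
        p.requires_grad_(False)  # model weights stay fixed

    ids = tok("the cat sat on the mat", return_tensors="pt").input_ids
    emb = model.get_input_embeddings()(ids).detach()
    k = 3  # treat only the last k input vectors as variable
    free = emb[:, -k:, :].clone().requires_grad_(True)
    opt = torch.optim.Adam([free], lr=1e-2)

    for step in range(100):
        opt.zero_grad()
        inputs = torch.cat([emb[:, :-k, :], free], dim=1)
        # the loss is the mean next-token NLL, i.e. (up to a constant
        # factor) the negative joint log-probability of the sequence
        out = model(inputs_embeds=inputs, labels=ids)
        out.loss.backward()
        opt.step()
    # `free` now holds descent-optimized input vectors that can be
    # compared against the forward pass's predicted output vectors.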
Even if they've saturated the distinguishable quality for tasks they can both do, I'd expect a gap in what tasks they're able to do.
An average driver evaluating both would have a very hard time finding the F1's superior utility.
Yes, because I'd get them to play each other?
I don't need to understand how the AI made the app I asked for or cured my cancer, but it'll be pretty obvious when the app seems to work and the cancer seems to be gone.
I mean, I want to understand how, but I don't need to understand how, in order to benefit from it. Obviously understanding the details would help me evaluate the quality of the solution, but that's an afterthought.
But I think we're not even on the path to creating AGI. We're creating software that replicates and remixes human knowledge at a fixed point in time. And so it's a fixed target that you can't really exceed, which would itself already entail diminishing returns. Pair this with the fact that it's based on neural networks, which also invariably reach a point of sharply diminishing returns in essentially every field they're used in, and you have something that looks much closer to what we're doing right now, where all competitors eventually converge on something largely indistinguishable from each other in terms of ability.
This doesn't really make sense outside computers. Since AI would be training itself, it needs to have the right answers, but as of now it doesn't really interact with the physical world. The most it could do is write code and check things that have no room for interpretation, like speed, latency, percentage of errors, exceptions, etc.
But in what other fields would it do this? How can it make strides in biology? It can't dissect animals; it can't figure out more about plants than what humans feed into the training data. Regarding math, math is human-defined. Humans said "addition does this", "this symbol means that", etc.
I just don't understand how AI could ever surpass what humans already know, when we live by rules defined by us.
Why would you presume this? I think part of a lot of people's AI skepticism is talk like this. You have no idea. Full stop. Why wouldn't progress be linear? As new breakthroughs come, newer ones will be harder to come by. Perhaps it's exponential. Perhaps it's linear. No one knows.
All the technological revolutions so far have accounted for little more than a 1.5% sustained annual productivity growth. There are always some low-hanging fruit with new technology, but once they have been picked, the effort required for each incremental improvement tends to grow exponentially.
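For scale (my arithmetic, not the parent's), here is how slowly that figure compounds:

    # 1.5% sustained annual productivity growth over a century:
    print(1.015 ** 100)  # ~4.43, i.e. output roughly quadruples in 100 years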
That's my default scenario with AGI as well. After AGI arrives, it will leave humans behind very slowly.
I think this is a hard kick below the belt for anyone trying to develop AGI using current computer science.
Current AIs only really generate - no, regenerate text based on their training data. They are only as smart as other data available. Even when an AI "thinks", it's only really still processing existing data rather than making a genuinely new conclusion. It's the best text processor ever created - but it's still just a text processor at its core. And that won't change without more hard computer science being performed by humans.
So yeah, I think we're starting to hit the upper limits of what we can do with Transformers technology. I'd be very surprised if someone achieved "AGI" with current tech. And, if it did get achieved, I wouldn't consider it "production ready" until it didn't need a nuclear reactor to power it.
They are unrelated. All you need is a way for continual improvement without plateauing, and this can start at any level of intelligence. As it did for us; humans were once less intelligent.
Using the flagship to bootstrap the next iteration with synthetic data is standard practice now. This was mentioned in the GPT-5 presentation. At the rate things are going I think this will get us to ASI, and it's not going to feel epochal for people who have interacted with existing models, but more of the same. After all, the existing models are already smarter than most humans and most people are taking it in their stride.
The next revolution is going to be embodiment. I hope we have the common sense to stop there, before instilling agency.
I would also not be surprised if, in developing something comparable to human intelligence (assuming the extreme computation, energy, and materials issues of packing that much computation and energy into a single system could be overcome), the AI also develops something comparable to human desire and/or mental health issues. There is a non-zero chance we could end up with AI that doesn't want to do what we ask it to do, or doesn't work all the time because it wants to do other things.
You can't just assume exponential growth is a foregone conclusion.
Why would the AI want to improve itself? From whence would that self-motivation stem?
AI can do it fine as it knows A and B. And that is knowledge creation.
It seems like the LLM will be a component of an eventual AGI, its voice so to speak, but not its mind. The mind still requires another innovation or breakthrough we haven't seen yet.
I am not an AI researcher, but I have friends who do work in the field, and they are not worried about LLM-based AGI because of the diminishing returns on results vs amount of training data required. Maybe this is the bottleneck.
Human intelligence is markedly different from LLMs: it requires far fewer examples to train on, and generalizes way better. Whereas LLMs tend to regurgitate solutions to solved problems, where the solutions tend to be well-published in training data.
That being said, AGI is not a necessary requirement for AI to be totally world-changing. There are possibly applications of existing AI/ML/SL technology which could be more impactful than general intelligence. Search is one example where the ability to regurgitate knowledge from many domains is desirable.
> That being said, AGI is not a necessary requirement for AI to be totally world-changing
Yeah. I don't think I actually want AGI? Even setting aside the moral/philosophical/etc "big picture" issues, I don't think I even want that from a purely practical standpoint. I think I want various forms of AI that are more focused on specific domains. I want AI tools, not companions or peers or (gulp) masters.
(Then again, people thought they wanted faster horses before they rolled out the Model T)
Models are truly input-multimodal now. Feeding in an image, feeding in audio, and feeding in text all go into separate input nodes, but it all feeds into the same inner layer set and outputs text. This also mirrors how brains work, as multiple parts integrated into one whole.
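A toy sketch of that shape (all dimensions and module choices invented for illustration): separate per-modality encoders feeding one shared trunk with a text-only output head.

    import torch
    import torch.nn as nn

    class TinyMultimodal(nn.Module):
        def __init__(self, d=256, vocab=32000):
            super().__init__()
            self.text_in = nn.Embedding(vocab, d)   # text tokens -> d
            self.image_in = nn.Linear(768, d)       # image patch features -> d
            self.audio_in = nn.Linear(128, d)       # audio frame features -> d
            layer = nn.TransformerEncoderLayer(d, nhead=8, batch_first=True)
            self.trunk = nn.TransformerEncoder(layer, num_layers=4)  # shared inner layers
            self.text_out = nn.Linear(d, vocab)     # output is text only

        def forward(self, text, image, audio):
            x = torch.cat([self.text_in(text),
                           self.image_in(image),
                           self.audio_in(audio)], dim=1)
            return self.text_out(self.trunk(x))

    m = TinyMultimodal()
    out = m(torch.randint(0, 32000, (1, 8)),  # 8 text tokens
            torch.randn(1, 4, 768),           # 4 image patches
            torch.randn(1, 6, 128))           # 6 audio frames
    print(out.shape)  # torch.Size([1, 18, 32000])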
Humans in some sense are not empty brains; there is a lot of stuff baked into our DNA, and as the brain grows it follows a baked-in developmental program. This is why we need fewer examples and generalize way better.
Instead of writing code with exacting parameters, future developers will write human-language descriptions for AI to interpret and convert into a machine representation of the intent. Certainly revolutionary, but not true AGI in the sense of the machine having truly independent agency and consciousness.
In ten years, I expect the primary interface of desktop workstations, mobile phones, etc will be voice prompts for an AI interface. Keyboards will become a power-user interface and only used for highly technical tasks, similar to the way terminal interfaces are currently used to access lower-level systems.
People say this, but honestly, it's not really my experience— I've given ChatGPT (and Copilot) genuinely novel coding challenges and they do a very decent job at synthesizing a new thought based on relating it to disparate source examples. Really not that dissimilar to how a human thinks about these things.
Depends on how you define "world changing" I guess, but this world already looks different to the pre-LLM world to me.
Asking LLMs things, instead of consulting the output of other humans, now takes up a significant fraction of my day. I don't google nearly as often, and I don't trust any image or video I see, as swathes of the creative professions have been replaced by output from LLMs.
It's funny; that final thing is the last thing I would have predicted. I always believed the one thing a machine could not match was human creativity, because the output of machines was always precise, repetitive and reliable. Then LLMs come along, randomly generating every token. Their primary weakness is that they are neither precise nor reliable, but they can turn out an unending stream of unique output.
But even with these it does not feel like AGI. It echoes the "fusion reactors are always 20 years away" argument, except this is supposedly coming in 2 years, and they haven't even got the core technology of how to build AGI.
I think you're on to it. Performance is clustering because a plateau is emerging. Hyper-dimensional search engines are running out of steam, and now we're optimizing.
For example, while you can get it to predict good chess moves if you train it on enough chess games, it can't really constrain itself to the rules of chess. (https://garymarcus.substack.com/p/generative-ais-crippling-a...)
Aren't we the summation of intelligence from quintillions of beings over hundreds of millions of years?
Have LLMs really had more data?
It's fascinating to me that so many people seem totally unable to separate the training environment from the final product
These AI computers aren’t thinking, they are just repeating.
That is because with LLMs there is no intelligence. It is Artificial Knowledge: AK, not AI. So AI is AGI. Not that it matters for the use cases we have, but marketing needs "AI" because that is what we were expecting for decades. So yeah, I also do not think we will have AGI from LLMs - nor does it matter for what we are using them for.
At Aloe, we are model agnostic and outperforming frontier models. It's the architecture around the LLM that makes the difference. For instance, our system using Gemini can do things that Gemini can't do on its own. All an LLM will ever do is hallucinate. If you want something with human-like general intelligence, keep looking beyond LLMs.
The fortunate thing is that we managed to invent an AI that is good at _copying us_ instead of being a truly maverick agent, which kinda limits it to "average human" output.
However, I still think that all the doomer arguments are valid, in principle. We very well may be doomed in our lifetimes, so we should take the threat very seriously.
I don’t see anything that would even point into that direction.
Curious to understand where these thoughts are coming from
This isn’t rocket science.
This seems to be a result of using overly simplistic models of progress. A company makes a breakthrough; the next breakthrough requires exploring many more paths. It is much easier to catch up than to find a breakthrough. Even if you get lucky and find the next breakthrough before everyone catches up, they will probably catch up before you find the breakthrough after that. You only get a runaway leader if, each time you make a breakthrough, it is easier to make the next breakthrough than it is for others to catch up.
Consider the following game:
1. N parties take turns rolling a D20. If anyone rolls 20, they get 1 point.
2. If any party is 1 or more points behind, they only need to roll a 19 or higher to get a point. That is, being behind gives you a slight advantage in catching up.
As points accumulate, most of the players end up with nearly the same score.
I ran a simulation of this game for 10,000 turns with 5 players:
Game 1: [852, 851, 851, 851, 851]
Game 2: [827, 825, 827, 826, 826]
Game 3: [827, 822, 827, 827, 826]
Game 4: [864, 863, 860, 863, 863]
Game 5: [831, 828, 836, 833, 834]
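Here's a sketch that reproduces numbers in this ballpark, reading rule 2 as "behind the current leader":

    import random

    def play(n_players=5, turns=10_000, sides=20):
        scores = [0] * n_players
        for _ in range(turns):
            for i in range(n_players):
                # a player behind the leader scores on 19+, else only on 20
                need = sides - 1 if scores[i] < max(scores) else sides
                if random.randint(1, sides) >= need:
                    scores[i] += 1
        return scores

    for game in range(1, 6):
        print(f"Game {game}: {play()}")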
But yes, so far it feels like we are in the latter stages of the innovation S-curve for transformer-based architectures. The exponent may be out there but it probably requires jumping onto a new S-curve.
I think it's likely that we will eventually hit a point of diminishing returns, where performance is good enough and marginal improvements aren't worth the high cost.
And over time, many models will reach "good enough" levels of performance, including models that are open weight. And given even more time, these open weight models will be runnable on consumer-level hardware. Eventually, they'll be runnable on super cheap consumer hardware (something more akin to an NPU than a $2000 RTX 5090). So your laptop in 2035, with specialized AI cores and 1TB of LPDDR10 RAM, is running GPT-7 level models without breaking a sweat. Maybe GPT-10 can solve some obscure math problem that your model can't, but does it even matter? Would you pay for GPT-10 when running a GPT-7 level model does everything you need and is practically free?
The cloud providers will make money because there will still be a need for companies to host the models in a secure and reliable way. But a company whose main business strategy is developing the model? I'm not sure they will last without finding another way to add value.
This raises the question: why do AI companies have these insane valuations? Do investors know something that we don't?
Presently we are still a long way from that. In my opinion we at least are as far away from AGI as 1970s mainframes were from LLMs.
I really don’t expect to see AGI in my lifetime.
You can only experience the world in one place in real time. Even if you networked a bunch of "experiencers" together to gather real time data from many places at the same time, you would need a way to learn and train on that data in real time that could incorporate all the simultaneous inputs. I don't see that capability happening anytime soon.
These big models don't dynamically update as days pass by - they don't learn. A personal assistant service may be able to mimic learning by creating a database of your data or preferences, but your usage isn't baked back into the big underlying model permanently.
I don't agree with "in our lifetimes", but the difference between training and learning is the bright red line. Until there's a model which is able to continually update itself, it's not AGI.
My guess is that this will require both more powerful hardware and a few more software innovations. But it'll happen.
I think we should be treating AGI like Cold Fusion, phrenology, or even alchemy. It is not science, but science fiction. It is not going to happen and no research into AGI will provide anything of value (except for the grifters pushing the pseudo-science).
It's all just hyperbole to attract investment and shareholder value and the people peddling the idea of AGI as a tangible possibility are charlatans whose goals are not aligned with whatever people are convincing themselves are the goals.
The fact that so many engineers have fallen for it so completely is stunning to me and speaks volumes about the underlying health of our industry.
However, I would not be so dismissive of the value. Many of us are reacting to the complete oversell of 'the encyclopedia' as being 'the eve of AGI' - as rightfully we should. But, in doing so, I believe it would be a mistake to overlook the incredible impact - and economic displacement - of having an encyclopedia comprised of all the knowledge of mankind that has "an interesting search interface" that is capable of enabling humans to use the interface to manipulate/detect connections between all that data.
The tech is neat and it can do some neat things but...it's a bullshit machine fueled by a bullshit machine hype bubble. I do not get it.
Yes. And the fact they're instead clustering simply indicates that they're nowhere near AGI and are hitting diminishing returns, as they've been doing for a long time already. This should be obvious to everyone. I'm fairly sure that none of these companies has been able to use their models as a force multiplier in state-of-the-art AI research. At least not beyond a 1+ε factor. Fuck, they're just barely a force multiplier in mundane coding tasks.
Thus, it’s easy to mistake one for the other - at least initially.
There were two interesting takeaways about AGI:
1. Dario makes the remark that the terms AGI/ASI are very misleading and dangerous. These terms are ill-defined, and it's more useful to understand that the capabilities are simply growing exponentially at the moment. If you extrapolate that, he thinks it may just "eat the majority of the economy". I don't know if this is self-serving hype, and it's not clear where we will end up with all this, but it will be disruptive, no matter what.
2. The Economist moderators, however, note towards the end that this industry may well tend toward commoditization. At the moment these companies produce models that people want but others can't make. But as chip making starts to hit its limits and the information space becomes completely harvested, capability growth might taper off and others will catch up, the quasi-monopoly profit potential melting away.
Putting that together, I think that although the cognitive capabilities will most likely continue to accelerate, albeit not necessarily along the lines of AGI, the economics of all this will probably not lead to a winner takes all.
[1] https://www.economist.com/podcasts/2025/07/31/artificial-int...
I also feel like it's stopped being exponential already. I mean, in the last few releases we've only seen marginal improvements. Even this release feels marginal; I'd say it feels more like a linear improvement.
That said, we could see a winner take all due to the high cost of copying. I do think we're already approaching something where it's mostly price and who released their models last. But the cost to train is huge, and at some point it won't make sense and maybe we'll be left with 2 big players.
2. Commoditization can be averted with access to proprietary data. This is why all of ChatGPT, Claude, and Gemini push for agents and permissions to access your private data sources now. They will not need to train on your data directly. Just adapting the models to work better with real-world, proprietary data will yield a powerful advantage over time.
Also, the current training paradigm utilizes RL much more extensively than in previous years and can help models to specialize in chosen domains.
Seriously, our government just announced it's slashing half a billion dollars in vaccine research because "vaccines are deadly and ineffective", and it fired a chief statistician because the president didn't like the numbers he calculated, and it ordered the destruction of two expensive satellites because they can observe politically inconvenient climate change. THOSE are the people you are trusting to keep an eye on the pace of development inside of private, secretive AGI companies?
Do you mean from ChatGPT launch or o1 launch? Curious to get your take on how they bungled the lead and what they could have done differently to preserve it. Not having thought about it too much, it seems that with the combo of 1) massive hype required for fundraising, and 2) the fact that their product can be basically reverse engineered by training a model on its curated output, it would have been near impossible to maintain a large lead.
LLMs pattern match well. They are good at "fast" System 1 thinking, instantly generating intuitive, fluent responses.
LLMs are good at mimicking logic, not real reasoning. They simulate "slow", deliberate System 2 thinking when prompted to work step by step.
The core of an LLM is not understanding but predicting the next most likely word in a sequence.
LLMs are good at both associative brainstorming (System 1) and creating works within a defined structure, like a poem (System 2).
Reasoning is the Achilles heel right now. An LLM's logic can seem plausible, but it's based on correlation, not deductive reasoning.
If I were working in a job right now where I could see and guide and retrain these models daily, and realized I had a weapon of mass destruction on my hands that could War Games the Pentagon, I'd probably walk my discoveries back too. Knowing that an unbounded number of parallel discoveries were taking place.
It won't take AGI to take down our fragile democratic civilization premised on an informed electorate making decisions in their own interests. A flood of regurgitated LLM garbage is sufficient for that. But a scorched earth attack by AGI? Whoever has that horse in their stable will absolutely keep it locked up until the moment it's released.
Yesterday, Claude Opus 4.1 failed to figure out that `-(1-alpha)`, i.e. `-1+alpha`, is the same as `alpha-1`.
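(The identity itself checks out mechanically, e.g. with SymPy:)

    from sympy import symbols, simplify

    alpha = symbols("alpha")
    print(simplify(-(1 - alpha) - (alpha - 1)))  # 0, so -(1-alpha) == alpha-1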
We are still a little bit away from AGI.
I don't think this has anything to do with AGI. We aren't at AGI yet. We may be close or we may be a very long way away from AGI. Either way, current models are at a plateau and all the big players have more or less caught up with each other.
As is, AI is quite intelligent, in that it can process large quantities of diverse unstructured information and build meaningful insights. And that intelligence applies across an incredibly broad set of problems and contexts. Enough that I have a hard time not calling it general. Sure, it has major flaws that are obvious to us, and it's much worse at many things we care about. But that doesn't make it not intelligent or general. If we want to set human intelligence as the baseline, we already have a word for that: superintelligence.
On the other hand, there are still some flaws in GPT-5. For example, when I use it for research it often needs multiple prompts to get to the topic I truly want, and sometimes it can feed me false information. So the reasoning part is not fully there yet?
Current models, when they apply reasoning, have feedback loops using tools for trial and error, a short-term memory (context), or multiple short-term memories if you use agents, and a long-term memory (markdown, RAG); they can solve problems that aren't hardcoded in their brain/model. And they can store these solutions in their long-term memory for later use, or for sharing with other LLM-based systems.
AGI needs to come from a system that combines LLMs + tools + memory. And I've had situations where it felt like I was working with an AGI. The LLMs seem advanced enough as the kernel for an AGI system.
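A bare skeleton of what I mean, where llm() is a stub standing in for a real model call and everything else is just the control structure:

    def llm(goal, context):
        # placeholder: a real system would call a model here
        if any("result" in c for c in context):
            return ("answer", f"done: {goal}")
        return ("tool", "search", goal)

    def run_agent(goal, tools, long_term_memory, max_iters=10):
        if goal in long_term_memory:               # reuse stored solutions
            return long_term_memory[goal]
        context = []                               # short-term memory
        for _ in range(max_iters):
            step = llm(goal, context)
            if step[0] == "answer":
                long_term_memory[goal] = step[1]   # consolidate for later use/sharing
                return step[1]
            _, tool, args = step
            context.append(f"result: {tools[tool](args)}")  # trial-and-error feedback

    memory = {}
    tools = {"search": lambda q: f"top hit for {q!r}"}
    print(run_agent("find the capital of France", tools, memory))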
The real challenge is how are you going to give these AGIs a mission/goal that they can do rather independently and don't need constant hand-holding. How does it know that it's doing the right thing. The focus currently is on writing better specifications, but humans aren't very good at creating specs for things that are uncertain. We also learn from trial and error and this also influences specs.
However, I do believe that once the genuine AGI threshold is reached it may cause a change in that rate. My justification is that while current models have gone from a slightly good copywriter in GPT-4 to a very good copywriter in GPT-5, they've gone from sub-exceptional in ML research to sub-exceptional in ML research.
The frontier in AI is driven by the top 0.1% of AI researchers. Since improvement in these models is driven partially by the very peaks of intelligence, it won't be until models reach that level that we start to see a new paradigm. Until then it's just scale and throwing whatever works at the GPU and seeing what comes out smarter.
I think you'll see the prophesied exponentiation once AI can start training itself at reasonable scale. Right now it's not possible.
The AIs improve by gradient descent, still the same as ever. It's all basic math and a little calculus, and then making tiny tweaks to improve the model over and over and over.
There's not a lot of room for intelligence to improve upon this. Nobody sits down and thinks really hard, and the result of their intelligent thinking is a better model; no, the models improve because a computer continues doing basic loops over and over and over trillions of times.
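To make that concrete, a minimal version of the loop (a toy linear model; real training differs in scale, not in kind):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(256, 4))
    true_w = np.array([1.0, -2.0, 0.5, 3.0])
    y = X @ true_w + 0.1 * rng.normal(size=256)

    w = np.zeros(4)
    lr = 0.05
    for step in range(1000):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad                          # the tiny tweak, repeated
    print(w)  # close to true_w after enough dumb loops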
That's my impression anyway. Would love to hear contrary views. In what ways can an AI actually improve itself?
This assumes an infinite potential for improvement though. It's also possible that the winner maxes out after threshold day plus one week, and then everyone hits the same limit within a relatively short time.
That seems hardly surprising, considering the condition to receive the benefit has not been met.
The person who lights a campfire first will become warmer than the rest, but while they are trying to light the fire the others are gathering firewood. So while nobody has a fire, those lagging are getting closer to having a fire.
This misunderstanding is nothing more than the classic "logistic curves look like exponential curves at the beginning". All (Transformer-based, feedforward) AI development efforts are plateauing rapidly.
AI engineers know this plateau is there, but of course every AI business has a vested interest in overpromising in order to access more funding from naive investors.
That took the world from autocomplete to Claude and GPT.
Another 10,000x would do it again, but who has that kind of money or R&D breakthrough?
The way scaling laws work, 5,000x and 10,000x give a pretty similar result. So why is it surprising that competitors land in the same range? It seems hard enough to beat your competitor by 2x let alone 10,000x
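Back-of-the-envelope, assuming a power-law scaling curve (the exponent 0.05 here is made up, though in the ballpark of published fits):

    def loss(compute, alpha=0.05):
        return compute ** -alpha  # power law: loss falls slowly with compute

    for mult in [1, 5_000, 10_000]:
        print(f"{mult:>6}x compute -> loss {loss(mult):.3f}")
    # 5,000x and 10,000x end up within a few percent of each other,
    # while both sit far below the 1x baseline.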
Part of the fun is that predictions get tested on short enough timescales to "experience" in a satisfying way.
Idk where that puts me, in my guess at "hard takeoff." I was reserved/skeptical about hard takeoff all along.
Even if LLMs had improved at a faster rate... I still think bottlenecks are inevitable.
That said... I do expect progress to happen in spurts anyway. It makes sense that companies of similar competence and resources get to a similar place.
The winner-take-all thing is a little forced. "Race to singularity" is the fun, rhetorical version of the investment case. The implied boring case is Facebook, AdWords, AWS, Apple, MSFT... i.e. the modern tech sector tends to create singular big winners, and therefore our pre-revenue market cap should be $1trn.
I personally think it's a pretty reductive model for what intelligence is, but a lot of people seem to strongly believe in it.
What do you think AGI is?
How do we go from sentence composing chat bots to General Intelligence?
Is it even logical to talk about such a thing as abstract general intelligence, when every form of intelligence we see in the real world is applied to specific goals, as behavioral technology refined through evolution?
When LLMs start undergoing spontaneous evolution then maybe it is nearer. But now they can't. Also there is so much more to intelligence than language. In fact many animals are shockingly intelligent but they can't regurgitate web scrapings.
To be honest that is what you would want if you were digitally transforming the planet with AI.
You would want to start with a core so that all models share similar values and don't bicker, etc., for negotiations, trade deals, logistics.
Would also save a lot of power so you don't have to train the models again and again, which would be quite laborious and expensive.
Rather each lab would take the current best and perform some tweak or add some magic sauce then feed it back into the master batch assuming it passed muster.
Share the work, globally for a shared global future.
At least that is what I would do.
AGI over LLMs is basically 1 billion tokens for the AI to answer the question "how do you feel?" with a response of "fine".
Because it would mean it's simulating everything in the world over an agentic flow: considering all possible options, checking memory, checking the weather, checking the news... activating emotional agentic subsystems, checking state... saving state...
They lack writable long-term memory beyond a context window. They operate without any grounded perception-action loop to test hypotheses. And they possess no executive layer for goal-directed planning or self-reflection...
Achieving AGI demands continuous online learning with consolidation.
This argument has so many weak points it deserves a separate article.
I wonder if that's because they have a lot of overlap in training sets and algorithms used, but more importantly, whether they use the same benchmarks and optimize for them.
As the saying goes, once a metric (or benchmark score in this case) becomes a target, it ceases to be a valuable metric.
It's the systems around the models where the proprietary value lies.
It's natural if you extrapolate from training loss curves; a training process with continually diminishing returns to more training/data is generally not something that suddenly starts producing exponentially bigger improvements.
Nothing we have is anywhere near AGI and as models age others can copy them.
I personally think we are nearing the end of improvement for LLMs with current methods. We have consumed all of the readily available data already, so there is no more good-quality training material left. We either need novel new approaches or have to hope that if enough compute is thrown at training, actual intelligence will spontaneously emerge.
SGI would be self-improving along some function close to linear in time and resources. That is almost exclusively dependent on the software design, as transformers have so far shown a wall: progress is only logarithmic in resources.
In other words, no, it has little to do with the commercial race.
This could be partly due to normative isomorphism[1] according to the institutional theory. There is also a lot of movement of the same folks between these companies.
Since then they've been about neck and neck with some models making different tradeoffs.
Nobody needs to reach AGI to take off. They just need to bankrupt their competitors since they're all spending so much money.
It's not architectures that matter anymore, it's unlocking new objectives and modalities that open another axis to scale on.
The improvements they make are marginal. How long until the next AI breakthrough? Who can tell? Last time it took decades.
That's only one part of it. Some forecasters put probabilities on each of the four quadrants in the takeoff speed (fast or slow) vs. power distribution (unipolar or multipolar) table.
2. Ben Evans frequently makes fun of the business value. It's pretty clear a lot of the models are commoditized.
3. Strategically, the winners are the platforms where the data are. If you have data in Azure, that's where you will use your models. Exclusive licensing could pull people to your cloud from on-prem. So some gains may go to those companies ...
But I doubt we will ever see a fully autonomous, reliable AGI system.
The real take-off / winner-take-all potential is in retrieval and knowing how to provide the best possible data to the LLM. That strategy will work regardless of the model.
It's not obvious if a similar breakthrough could occur in AI.
But nowadays, how can corporations justify R&D that spends gigantic amounts of resources (time + hardware + energy) on models which are not LLMs?
Even if we run with the assumption that LLMs can become human-level AI researchers, and are able to devise and run experiments to improve themselves, even then the runaway singularity assumption might not hold. Let's say Company A has this LLM, while Company B does not.
- The automated AI researcher, like its human peers, still needs to test ideas and run experiments; it might happen that testing (meaning compute) is the bottleneck, not the ideas, so Company A has no real advantage.
- It might also happen that AI training has some fundamental compute limit coming from information theory, analogous to the Shannon limit, and once again, more efficient compute can only approach this limit, not overcome it.
It's probably never going to work with a single process without consuming the resources of the entire planet to run that process on.
Both the AGI threshold with LLM architecture, and the idea of self-advancing AI, is pie in the sky, at least for now. These are myths of the rationalist cult.
We'd more likely see reduced returns and smaller jumps between version updates, plus regression from all the LLM produced slop that will be part of the future data.
Meanwhile, keep all relevant preparations in secret...
Why is this even an axiom, that this has to happen and it's just a matter of time?
I don't see any credible argument for the path LLM -> AGI, in fact given the slowdown in enhancement rate over the past 3 years of LLMs, despite the unprecedented firehose of trillions of dollars being sunk into them, I think it points to the contrary!
I have had a bunch of positive experiences as well, but when it goes bad, it goes so horribly bad and off the rails.
I think user experience and pricing models are the biggest differentiators here. Right now everyone's just passing down costs as they come; no real loss leaders except a free tier. I looked at reviews of various wrappers on app stores; people say "I hate that I have to pay for each generation and not know what I'm going to get". The market would like a service priced very differently. Is it economical? Many will fail, one will succeed. People will copy the model of that one.