Now what....? What's happening right now that should make me care that AGI is here (or not)? What's the magic thing that's happening with AGI that wasn't happening before?
<looks out of window> <checks news websites> <checks social media...briefly> <asks wife>
Right, so, not much has changed from 1-2 years ago that I can tell. The job market's a bit shit if you're in software...is that what we get for billions of dollars spent?
The writing is on the wall. Even if there's no new advances in technology, the current state is upending jobs, education, media, etc
It took one September. Then as soon as you could take payments on the internet the rest was inevitable and in _clear_ demand. People got on long waiting lists just to get the technology in their homes.
> no new advances in technology
The reason the internet became so accessible is that Moore was generally correct. There were two corresponding exponential processes that vastly changed the available rate of adoption. This wasn't at all like cars being introduced into society. This was a monumental shift.
I see no advances in LLMs that suggest any form of the same exponential processes exist. In fact the inverse is true. They're not reducing power budgets fast enough to even imagine that they're anywhere near AGI, and even if they were, that they'd ever be able to sustainably power it.
> the current state is upending jobs
The difference is companies fought _against_ the internet because it was so disruptive to their business model. This is quite the opposite. We don't have a labor crisis, we have a retention crisis, because companies do not want to pay fair value for labor. We can wax on and off about technology, and perceptrons, and training techniques, or power budgets, but this fundamental fact seems the hardest to ignore.
If they're wrong this all collapses. If I'm wrong I can learn how to write prompts in a week.
It's the classic "slowly, then suddenly" paradigm. It took decades to get to that one September. Then years more before we all had internet in our pocket.
> The reason the internet became so accessible is because Moore was generally correct.
Can you explain how Moore's law is relevant to the rise of the internet? People didn't start buying couches online because their home computer lacked sufficient compute power.
> I see no advances in LLMs that suggest any form of the same exponential processes exist.
LLMs have seen enormous growth in power over the last 3 years. Nothing else comes close. I think they'll continue to get better, but critically: even if LLMs stay exactly as powerful as they are today, it's enough to disrupt society. IMHO we're already at AGI.
> The difference is companies fought _against_ the internet
Some did, some didn't. As in any cultural shift, there were winners and losers. In this shift, too, there will be winners and losers. The panicked spending on data centers right now is a symptom of the desire to be on the right side of that.
> because companies do not want to pay fair value for labor.
Companies have never wanted to pay fair value for labor. That's a fundamental attribute of companies, arising as a consequence of the system of incentives provided in capitalism. In the past, there have been opportunities for labor to fight back: government regulation, unions. This time that won't help.
> If I'm wrong I can learn how to write prompts in a week.
Why would you think that anyone would want you to write prompts?
Rapid deindustrialization followed by the internet and social media almost broke our society.
Also, I don’t think people necessarily realize how close we were to the cliff in 2007.
I think another transformation now would rip society apart rather than take us to the great beyond.
Most of all, AI will exacerbate the lack of trust in people and institutions that was kicked into high gear by the internet. It will be easy and cheap to convince large numbers of people about almost anything.
The GFC was a big recession, but I never thought society was near collapse.
The internet made it so that you can share and access information in a few minutes, if not seconds.
Smartphones built on the internet by making it possible to share and access information from anywhere, by anyone.
AI seems to occupy the same space as Google in the broader internet ecosystem. I don't know what AI provides me that a few hours of Google searches wouldn't. It makes information retrieval faster, but that was never the hard part. The hard part was understanding the information, so that you're able to apply it to your particular situation.
Being able to write to-do apps X1000 faster is not innovation!
The rest of the world has mostly been experiencing industrialisation, and was only indirectly affected by the great crash.
If there is a transformation in the rest of the world the west cannot escape it.
A lot of people in the west seem to have their heads in the sand, very much like when Japan and China tried to ignore the west.
China is the world's second biggest economy by nominal GDP, India the fourth. We have a globalised economy where everything is interlinked.
99% of people only ever use proprietary networks from FAANG corporations. That's not "the internet", that's an evolution of CompuServe and AOL.
We got TCP/IP and the "web-browser" as a standard UI toolkit stack out of it, but the idea of the world wide web is completely dead.
Coincidentally, just about the time it hit the mainstream is when the enshittification began to go exponential. Be careful what you wish for.
My usual way of thinking about it is AGI means can do all the stuff humans do which means you'd probably after a while look out the window and see robots building houses and the like. I don't think that's happening for a while yet.
Now, I do not in the least believe that we have created AGI, nor that we are actually close. But you're absolutely right that we can't just handwave away the definitions. They are crucial both to what it means to have AGI, and to whether we do (or soon will) or not.
After enlightenment^WAGI: chop wood, fetch water, prepare food
Its core thesis was: Every era doubled the amount of technological change of the prior era in one half the time.
At the time he wrote the book in 1970, he was making the point that the pace of technological change had, for the first time in human history, rendered the knowledge of society's elders - previously the holders of all valuable information - irrelevant.
The pace of change has continued to steadily increase in the ensuing 55 years.
Edit: grammar
A slightly different angle on this - perhaps AGI doesn't matter (or perhaps not in the ways that we think).
LLMs have changed a lot in software in the last 1-2 years (indeed, the last 1-2 months); I don't think it's a wild extrapolation to see that'll come to many domains very soon.
Even if it doesn't, you will be indirectly affected. People will flock to trades if knowledge work is no longer a source of viable income.
There is a definition of AGI the AI companies are using to justify their valuation. It's not what most people would call AGI but it does that job well enough, and you will care when it arrives.
They define it as an AI that can develop other AIs faster than the best team of human engineers. Once they build one of those in house they outpace the competition and become the winner that takes all. Personally I think it's more likely they will all achieve it at a similar time. That would mean the race will continue, accelerating as fast as they can build data centres and power plants to feed them.
It will impact everyone, because the already dizzying pace of the current advances will accelerate. I don't know about you, but I'm having trouble figuring out what my job will be next year as it is.
An AI that just develops other AIs could hardly be called "general" in my book, but my opinion doesn't count for much.
None, as I don't develop LLMs.
I wasn't saying I think they will succeed, but I think it is worth noting their AGI ambitions are not as grand as the term implies. Nonetheless, if they achieve them, the world will change.
Remember that weather balloon the US found a few years ago that for days was on the news as a Chinese spy balloon?
Well, whether it was a spy balloon or a weather balloon, the first hint of its existence could have triggered a nuclear war that would already have been the end of the world as we know it, because AGI will almost certainly be deployed to control the U.S. and Chinese military systems, and it would have acted well before any human had time to intercept its actions.
That’s the apocalyptic nuclear winter scenario.
There are many other scenarios.
An AGI which has been infused with a tremendous amount of ethics so the above doesn’t happen may also lead to terrible outcomes for humans. An AGI would essentially be a different species (although a non-biological one). If it replicated human ethics, even as inconsistently as we apply them, it would learn that treating other species brutally is acceptable (we breed, enslave, imprison, torture, and then kill over 80 billion land animals annually in animal agriculture, and possibly trillions of water animals). There’s no reason it wouldn’t do that to us.
Finally, if we infuse it with our ethics but it’s smart enough to apply them consistently (even a basic application of our ethics would have us end animal agriculture immediately), so it realizes that humans are wrong and doesn’t do the same thing to humans, it might still create an existential crisis for humans as our entire identity is based on thinking we are smarter and intellectually superior to all other species, which wouldn’t be true anymore. Further it would erode beliefs in gods and other supernatural BS we believe which might at the very least lead humans to stop reproducing due to the existential despair this might cause.
And as for the Chinese spy balloon, there was never any risk of a war (at least not from that specific cause). The US, China, Russia, and other countries routinely spy on each other through a variety of unarmed technical means. Occasionally it gets exposed and turns into a diplomatic incident but that's about it. Everyone knows how the game is played.
https://gizmodo.com/for-20-years-the-nuclear-launch-code-at-...
Strong agentic AIs are a death sentence memo pad (or a malevolent djinn lamp if you like) that anyone can write on, because the tools will be freely available to leverage. A plutonium breeder reactor in every backyard. Try not to think of paperclips.
Yeah, it really doesn't matter if AGI has happened, is going to happen, will never happen, whatever. No matter what sort of definition we make for it, someone's always going to disagree anyway. For a looong time, we thought the Turing test was the standard, and that only a truly intelligent computer could beat it. It's been blown out of the water for years now, and now we're all arguing about new definitions for AGI.
At the end of the day, like you say, it doesn't matter a bit how we define terms. We can label it whatever we want, but the label doesn't change what it can DO
What it can DO is the important part. I think a lot of software devs are coming to terms with the idea that AI will be able to replace vast chunks of our jobs in the very near future.
If you use these things heavily, you can see the trajectory.
6 months ago I'd only trust them for boiler plate code generation and writing/reviewing short in-line documentation.
Today, with the latest models and tools, I'm trusting them with short/low impact tasks (go implement this UI fix, then redeploy the app locally, navigate to it, and verify the fix looks correct).
6 months from now, my best guess is that they'll continue to become more capable of handling longer + more complex tasks on their own.
5 years from now, I'm seeing a real possibility that they'll be handling all the code, end to end.
Doesn't matter if we call that AGI or not. It very much will matter whose jobs get cut, because one person with AI can do the work of 20 developers
>Now what....? Whats happening right now that should make me care that AGI is here (or not).
Do you have any insight into what those changes might concretely be? Or are you just trying to instil fear in people who lack critical thinking skills?
I think what you are trying to say is can we define AGI so that we can have an intelligent conversation about what that will mean for our daily lives?. But you oddly introduced your argument by stating you didn't want to explore this definition...
That's Trump's economy, not LLMs.
Many people are slowly losing jobs and can’t find new ones. You’ll see effects in a few years.
In what units?
I don’t mean 3x compounding monthly every month, I mean 3x total since I started using Claude Code about 6 months ago but the benefits keep compounding.
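To make the distinction concrete, here is a quick back-of-the-envelope sketch. The 3x figure and the 6-month window come from the comment above; the variable names are only illustrative.

```python
# Figures taken from the comment: roughly 3x total over about 6 months.
months = 6
total_gain = 3.0

# "3x compounding monthly" would mean multiplying by 3 every single month.
compounded_monthly = total_gain ** months

# A steady monthly multiplier that ends at 3x total after 6 months.
implied_monthly = total_gain ** (1 / months)

print(compounded_monthly)         # 729.0
print(round(implied_monthly, 3))  # 1.201
```

So "3x total in 6 months" works out to roughly 20% per month, a long way from the 729x that monthly tripling would imply.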
Firefox introducing their dev debugger many years ago "completely changed my life and the way I write code and run my business"
You get the idea. Yes, the day to day job of software engineering has changed. The world at large cares not one jot.
Are you making 3x the money compounding monthly ?
No?
Then what's the point?
You're not fooling anyone
Has it run away yet? Not sure, but is it currently in the process of increasing intelligence with little input from us? Yes.
Exponential graphs always have a slow curve in the beginning.
This is true in a specific contextual sense (each token that an LLM produces is from a feed-forward pass). But it has been untrue for more than a year with reasoning models, which feed their produced tokens back as inputs, and whose tuning effectively rewards them for doing this skillfully.
Heck, it was untrue before that as well, any time an LLM responded with more than one token.
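A minimal sketch of that feedback loop, for anyone who wants to see its shape. The `generate` helper and the `toy_countdown` "model" are hypothetical stand-ins, not any real library's API; the toy function does a fixed amount of work per call, and all the iteration comes from feeding its output back in.

```python
from typing import Callable, List


def generate(next_token: Callable[[List[str]], str],
             prompt: List[str],
             max_new_tokens: int,
             stop: str = "<eos>") -> List[str]:
    """Autoregressive loop: each call to next_token stands in for one forward pass."""
    context = list(prompt)
    for _ in range(max_new_tokens):
        token = next_token(context)  # one fixed-cost pass over the current context
        if token == stop:
            break
        context.append(token)        # the emitted token becomes input to the next pass
    return context[len(prompt):]


def toy_countdown(context: List[str]) -> str:
    """Hypothetical stand-in for a model: looks only at the last token."""
    last = int(context[-1])
    return stop_token if last == 0 else str(last - 1)


stop_token = "<eos>"
print(generate(toy_countdown, ["3"], max_new_tokens=10))  # ['2', '1', '0']
```

No single call to `toy_countdown` loops, yet the system as a whole counts down as far as the token budget allows, which is the sense in which multi-token output is more than one feed-forward pass.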
> A [March] 2025 survey by the Association for the Advancement of Artificial Intelligence (AAAI), surveying 475 AI researchers, found that 76% believe scaling up current AI approaches to achieve AGI is "unlikely" or "very unlikely" to succeed.
I dunno. This survey publication was from nearly a year ago, so the survey itself is probably more than a year old. That puts us at Sonnet 3.7. The gap between that and present day is tremendous.
I am not skilled enough to say this tactfully, but: expert opinions can be the slowest to update on the news that their specific domain may, in hindsight, have been the wrong horse. It's the quote about it being difficult to believe something that your income requires to be false, but instead of income it can be your whole legacy or self-concept. Way worse.
> My take is that research taste is going to rely heavily on the short-duration cognitive primitives that the ARC highlights but the METR metric does not capture.
I don't have an opinion on this, but I'd like to hear more about this take.
> who feed their produced tokens back as inputs, and whose tuning effectively rewards it for doing this skillfully
Ah, this is a great point, and not something that I considered. I agree that the token feedback does change the complexity, and it seems that there's even a paper by the same authors about this very thing! https://arxiv.org/abs/2310.07923
I'll have to think on how that changes things. I think it does take the wind out of the architecture argument as it's currently stated, or at least makes it a lot more challenging. I'll consider myself a victim of media hype on this, as I was pretty sold on this line of argument after reading this article https://www.wired.com/story/ai-agents-math-doesnt-add-up/ and the paper https://arxiv.org/pdf/2507.07505 ... who brush this off with:
>Can the additional think tokens provide the necessary complexity to correctly solve a problem of higher complexity? We don't believe so, for two fundamental reasons: one that the base operation in these reasoning LLMs still carries the complexity discussed above, and the computation needed to correctly carry out that very step can be one of a higher complexity (ref our examples above), and secondly, the token budget for reasoning steps is far smaller than what would be necessary to carry out many complex tasks.
In hindsight, this doesn't really address the challenge.
My immediate next thought is - even if solutions up to P can be represented within the model / CoT, do we actually feel like we are moving towards generalized solutions, or that the solution space is navigable through reinforcement learning? I'm genuinely not sure where I stand on this.
> I don't have an opinion on this, but I'd like to hear more about this take.
I'll think about it and write some more on this.
Hands on experience is better than reading articles.
I've been coding for 40 years and after a few months getting familiar with these tools, this feels really big. Like how the internet felt in 1994.
Fun observation - almost every coding harness (claude code, cursor, codex) uses a find/replace tool as the primary way of interacting with code. This requires the agent to fully type out the code it's trying to edit, including several lines of context around the edit. This is really inefficient, token wise! Why does it work this way? Because the LLMs are really bad at counting lines, or using other ways of describing a unique location in the file.
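For readers who haven't looked inside one of these harnesses, here is a rough sketch of what a find/replace edit tool amounts to. `apply_edit` is a hypothetical helper written for illustration, not the actual tool definition from Claude Code, Cursor, or Codex; the uniqueness check is an assumption about why the agent has to quote surrounding context.

```python
def apply_edit(text: str, find: str, replace: str) -> str:
    """Replace `find` with `replace`, but only if `find` occurs exactly once."""
    count = text.count(find)
    if count == 0:
        raise ValueError("find text not present in file")
    if count > 1:
        raise ValueError(f"find text is ambiguous ({count} matches); add more context")
    return text.replace(find, replace, 1)


source = "x = 1\ny = 1\nprint(x + y)\n"

# "= 1" alone matches two lines, so the edit must quote enough text to be unique.
patched = apply_edit(source, "y = 1", "y = 2")
print(patched)
```

The location of the edit is described entirely by the quoted text, so the model never has to count lines, at the cost of retyping everything it wants to anchor on.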
I've experimented with providing a more robust dsl for text manipulation https://github.com/dlants/magenta.nvim/blob/main/node/tools/... , and I do think it's an improvement over just straight search/replace, but the agents do tend to struggle a lot - editing the wrong line, messing up the selection state, etc... which is probably why the major players haven't adopted something like this yet.
So I feel pretty confident in my assessment of where these models are at!
And also, I fully believe it's big. It's a huge deal! My work is unrecognizable from what it was even 2 years ago. But that's an impact / productivity argument, not an argument about intelligence. Modern programming languages, IDEs, spreadsheets, etc... also made a fundamental shift in what being a software engineer was like, but they were not generally intelligent.
You run it again, with a bigger input. If it needs to do a loop to figure out what the next token should be (Ex. The result is: X), it will fail. Adding that token to the input and running it again is too late. It has already been emitted. The loop needs to occur while "thinking" not after you have already blurted out a result whether or not you have sufficient information to do so.
Not sure I follow. Are you saying that AI researchers would be out of a job if scaling up transformers leads to AGI? How? Or am I misunderstanding your point.
Reconciling your self-concept with the negative (or fruitless) impacts of your life's work is difficult. It can be easier to deny or minimize those impacts.
Or is your reasoning that they will be upset about not having invented it themselves (similar to those conspiracy theories about the cure for cancer existing but scientists withholding it so they can keep doing treatment research)?
I dunno, mixed bag. Value is positive if you can sort the wheat from the chaff for the use cases I've run by it. I expect the main place it'll shine for the near and medium term is going over huge data sets or big projects and flagging things for review by humans.
All this being said, what I was throwing at it was really not what it was optimized for, and it still delivered some really good ideas.
Although the poster had a bus company business plan that includes actuarial analysis in his head and some spreadsheets so that bar appears to be sufficiently high.
In other words - If he was low skilled, AI would impress him. Now that he is high skilled, AI still impressed him.
In other words - AI improves on what a human can do.
Maybe AGI's arrival is when one day someone is given an AI to supervise instead of a new employee.
Just a user who's followed the whole mess, not a researcher. I wonder if the scaffolding and bolt-ons like reasoning will sufficiently be an asymptote to 'true AGI'. I kept reading about the limits of transformers around GPT-4 and Opus 3 time, and then those seem basic compared to today.
I gave up trying to guess when the diminishing returns will truly hit, if ever, but I do think some threshold has been passed where the frontier models are doing "white collar work as an API" and basic reasoning better than the humans in many cases, and once capital familiarizes themselves with this idea more, it's going to get interesting.
A couple days ago he was telling me one lady he was trying to sell on it wouldn’t use it. She took the position that if she can’t trust the answers all the time, she isn’t going to trust or use it for anything. My dad almost seemed offended by this idea, he couldn’t understand why someone wouldn’t want the benefits it could offer, even if it wasn’t perfect.
I think her position was very sound. We see how much misinformation spreads online and how vulnerable people are to it. Wanting a trusted source of information is not a bad thing. Getting information more quickly is of little value if it isn’t reliable data.
If I prod my dad enough about it, he will admit that ChatGPT has made some mistakes that he caught. He knew enough to question it more when it was wrong. The problem is, if he already knew the answer, why was he asking in the first place… and if it was something he wasn’t well versed on, how does he know it’s giving him good data?
People are defaulting to trust, unless they catch the LLM in a lie. How many times does someone have to lie to a person before they are labeled a liar and no longer trusted at face value? For me, these LLMs have been labeled a liar and I don’t trust them. Trust takes a long time to rebuild once it’s broken.
I mostly use LLMs to augment search, not replace it. If it gives me an answer, I’ll click through to the sourced reference and see what it says there, and evaluate if it’s a source worth trusting. In many cases the LLM will get me to the right page, but it will jumble up the details and get them wrong, like a bad game of telephone.
There doesn't seem to be a reason why AIs should act as these distinct entities that manage each other or form teams or whatever.
It seems to me way more likely that everything will just be done internally in one monolithic model. The AIs just don't have the constraints that humans have in terms of time management, priority management, social order, all the rest of it that makes teams of individuals the only workable system.
AI simply scales with the compute resources made available, so it seems like you'd just size those resources appropriately for a problem, maybe even on demand, and have a singular AI entity (if it's even meaningful to think of it as such, even that's kind of an anthropomorphisation) just do the thing. No real need for any organisational structure beyond that.
So I'd think maybe the opposite, seems like what agents really means is a way to use fundamentally narrow/limited AI inside our existing human organisations and workflows, directed by humans. Maybe AGI is when all that goes away because it's just obviously not necessary any more.
There's more than one way to do intelligence. Basic intelligence has evolved independently three times that we know of - mammals, corvids, and octopuses. All three show at least ape-level intelligence, but the species split before intelligence developed, and the brain architectures are quite different. Corvids get more done with less brain mass than mammals, and don't have a mammalian-type cortex. Octopuses have a distributed brain architecture, and have a more efficient eye design than mammals.
For a clear analogy, consider how tokenization causes LLMs to behave stupidly in certain cases, even though they're very capable in others.
“I’m not an ML expert and I haven’t read your article, but here’s my amazing experience with LLM Agents that changed my life:”
"I’m not a mechanical engineer, but I watched a five-minute YouTube video on how a diesel engine works, so I can tell you that mechanical engineering is a solved problem."
It's weird that this sentence has two distinct meanings and the author never considers the second or points it out. Maybe Mary is holding a ball for her society friends.
I keep wondering how well unaligned models perform. Especially when I look back at what was possible in December 2023 before they started to lock down safety realignments.
For anyone seeing 404
The critique that Transformers are limited by their "one-shot" feed-forward nature also misses the point of their architectural efficiency. Human brains rely on recurrence and internal feedback loops largely as a workaround for our embarrassingly small working memory—we can barely juggle ten concepts at once without a pen and paper. AI doesn't need to mimic our slow, vibrating neural signals when its global attention can process a massive, parallelized workspace in a single pass. This "all-at-once" calculation of relationships is fundamentally more powerful than the biological need to loop signals until they stabilize into a "thought."
Furthermore, the obsession with "fragility"—where a model solves quantum mechanics but fails a child’s riddle—is a red herring. Humans aren't nearly as "general" as we tell ourselves; we are also pattern-matchers prone to optical illusions and simple logic traps, regardless of our IQ. Demanding that AI replicate the specific evolutionary path of a human child is a form of biological narcissism. If a machine can out-calculate us across a hundred variables where we can only handle five, its "non-human" way of knowing is a feature, not a bug. Functional replacement has never required biological mimicry; the jet engine didn't need to flap its wings to redefine flight.
I do want to push back on some things:
> We treat "cognitive primitives" like object constancy and causality as if they are mystical, hardwired biological modules, but they are essentially just
I don't feel like I treated them as mystical - I cite several studies that define what they are and correlate them to certain structures in the brain that have developed millennia ago. I agree that ultimately they are "just" fitting to patterns in data, but the patterns they fit are really useful, and were fundamental to human intelligence.
My point is that these cognitive primitives are very much useful for reasoning, and especially the sort of reasoning that would allow us to call an intelligence general in any meaningful way.
> This "all-at-once" calculation of relationships is fundamentally more powerful than the biological need to loop signals until they stabilize into a "thought."
The argument I cite is from complexity theory. It's proof that feed-forward networks are mathematically incapable of representing certain kinds of algorithms.
> Furthermore, the obsession with "fragility"—where a model solves quantum mechanics but fails a child’s riddle—is a red herring.
AGI can solve quantum mechanics problems, but verifying that those solutions are correct still (currently) falls to humans. For the time being, we are the only ones who possess the robustness of reasoning we can rely on, and it is exactly because of this that fragility matters!
Claiming FFNs are mathematically incapable of certain algorithms misses the fact that an LLM in production isn't a static circuit, but a dynamic system. Once you factor in autoregression and a scratchpad (CoT), the context window effectively functions as a Turing tape, which sidesteps the TC0 complexity limits of a single forward pass.
> AGI can solve quantum mechanics problems, but verifying that those solutions are correct still (currently) falls to humans. For the time being, we are the only ones who possess the robustness of reasoning we can rely on, and it is exactly because of this that fragility matters!
We haven't "sensed" or directly verified things like quantum mechanics or deep space for over a century; we rely entirely on a chain of cognitive tools and instruments to bridge that gap. LLMs are just the next layer of epistemic mediation. If a solution is logically consistent and converges with experimental data, the "robustness" comes from the system's internal logic.
Humans have a great capacity for problem solving and creativity which, at its heights, completely dwarfs other creatures on this planet. What else would we reference for general intelligence if not ourselves?
My skepticism towards AGI is primarily supported by my interactions with current systems that are contenders for having this property.
Here's a recent conversation with chatgpt.
https://chatgpt.com/share/69930acc-3680-8008-a6f3-ba36624cb2...
This system doesn't seem general to me; it seems like a specialized tool that has really good logic mimicry abilities. I asked it if the silence response was hard coded. It said no, then went on to explain how the silence was hard coded via a separate layer from the LLM portion, which would otherwise just respond indefinitely.
Its output is extremely impressive, but general intelligence it is not.
On your final point about functional replacement not requiring biological mimicry. We don't know whether biological mimicry is required or not. We can only test things until we find out or gain some greater understanding of reality that allows us to prove how intelligence emerges.
I'm honestly shocked by the latest results we're seeing with Gemini 3 Deep Think, Opus 4.6, and Codex 5.3 in math, coding, abstract reasoning, etc. Deep Think just scored 84.6% on ARC-AGI-2 (https://deepmind.google/models/gemini/)! And these benchmarks are supported by my own experimentation and testing with these models ~ specifically most recently with Opus 4.6 doing things I would have never thought possible in codebases I'm working in.
These models are demonstrating an incredible capacity for logical abstract reasoning of a level far greater than 99.9% of the world's population.
And then combine that with the latest video output we're seeing from Seedance 2.0, etc showing an incredible level of image/video understanding and generation capability.
I was previously deeply skeptical that the architecture we have would be sufficient to get us to AGI. But my belief in that has been strongly rattled lately. Honestly I think the greatest gap now is simply one of orchestration, data presentation, and work around in-context memory representations - that is, converting work done in the real world into formats/representations amenable for AI to run on (text conversion, etc.) and keeping newly trained/taught information in context to support continual learning.
This is the key I think that Altman and Amodei see, but get buried in hype accusations. The frontier models absolutely blow away the majority of people on simple general tasks and reasoning. Run the last 50 decisions I've seen locally through Opus 4.6 or ChatGPT 5.2 and I might conclude I'd rather work with an AI than the human intelligence.
It's a soft threshold where I think people saw it spit out some answers during the chat-to-LLM first hype wave and missed that the majority of white collar work (I mean it all, not just the top software industry architects and senior SWEs) seems to come out better when a human is pushed further out of the loop. Humans are useful for spreading out responsibility and accountability, for now, thankfully.
And yet they have trouble knowing that a person should take their car to a car wash.
I also saw a college professor who put various AI models through all his exams for a freshman(?) level class. Most failed, I think one barely passed, if I remember correctly.
I’ve been reading about people being shocked by how good things are for years now, but while there may be moments of what seems like incredible brilliance, there are also moments of profound stupidity. AI optimists seem to ignore these moments, but they are very real.
If someone on my team performed like AI, I wouldn’t trust them with anything.
SotA models don't.
I asked Opus 4.6 to multiply 50,651 x 895,707. It immediately responded: 45,368,055,057. This was wrong, but very close (good to about 10^-5 relative error). I then asked it how it arrived at that answer, and whether it had used any tool calls. It said:
```
No, I didn't use any specialized tools — I just computed it directly. Though I should be honest: large multiplications done "in my head" like this are prone to error. Let me double-check by breaking it down:

50,651 × 895,707 = 50,651 × (900,000 - 4,293)
                 = 50,651 × 900,000 - 50,651 × 4,293

First term: 50,651 × 900,000 = 45,585,900,000

Second term: 50,651 × 4,293
  50,651 × 4,000 = 202,604,000
  50,651 × 293 = 14,840,743
  Total: 202,604,000 + 14,840,743 = 217,444,743

Result: 45,585,900,000 - 217,444,743 = 45,368,455,257

So my first answer was actually slightly off! The corrected answer is 45,368,455,257. I don't have a calculator tool available — I'm doing this arithmetically by decomposition. For mission-critical calculations, I'd recommend verifying with a calculator since I can make arithmetic errors on large numbers like this.
```
So I would rate OP's claim (which is supported by a paper from 2024) as highly implausible. Opus 4.6 appears to be able to do multi-digit arithmetic formally, as well as give remarkably accurate estimates based on something like "number sense".
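For what it's worth, the figures in that transcript are easy to check with exact integer arithmetic. A minimal Python sketch (the variable names are mine, not from the transcript) that confirms the decomposed result is the true product and that the first reply was off by roughly 10^-5 in relative terms:

```python
# Exact check of the multiplication discussed above.
# Python integers are arbitrary precision, so there is no rounding here.
a, b = 50_651, 895_707

exact = a * b                          # ground truth
first_answer = 45_368_055_057          # the model's immediate reply
decomposed = a * 900_000 - a * 4_293   # the model's own decomposition

# Relative error of the first (wrong) reply against the exact product.
relative_error = abs(first_answer - exact) / exact

print(f"exact product:      {exact:,}")
print(f"decomposition ok:   {decomposed == exact}")
print(f"first-answer error: {relative_error:.1e}")
```

The exact product is 45,368,455,257, so the second answer in the transcript is correct and the first one misses by 400,200.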
>Imagine you had a frozen [large language] model that is a 1:1 copy of the average person, let’s say, an average Redditor. Literally nobody would use that model because it can’t do anything. It can’t code, can’t do math, isn’t particularly creative at writing stories. It generalizes when it’s wrong and has biases that not even fine-tuning with facts can eliminate. And it hallucinates like crazy often stating opinions as facts, or thinking it is correct when it isn't.
>The only things it can do are basic tasks nobody needs a model for, because everyone can already do them. If you are lucky you get one that is pretty good in a singular narrow task. But that's the best it can get.
>and somehow this model won't shut up and tell everyone how smart and special it is also it claims consciousness. ridiculous.
What is the benchmark now that the Turing test has been blown out of the water?
The fundamental issue was the assumption that general intelligence is an objective property that can be determined experimentally. It's better to consider intelligence an abstraction that may help us to understand the behavior of a system.
A system where a fixed LLM provides answers to prompts is little more than a Chinese room. If we give the system agency to interact with external systems on its own initiative, we get qualitatively different behavior. The same happens if we add memory that lets the system scale beyond the fixed context window. Now we definitely have some aspects of general intelligence, but something still seems to be missing.
Current AIs are essentially symbolic reasoning systems that rely on a fixed model to provide intuition. But the system never learns. It can't update its intuition based on its experiences.
Maybe the ability to learn in a useful way is the final obstacle on the way towards AGI. Or maybe once again, once we start thinking we are close to solving intelligence, we realize that there is more to intelligence than what we had thought so far.
For example, looking at the statistical distribution of the chat over long time horizons, and looking at input/output correlations in a similar manner would out even the best current models in a "Pro Turing Test." Ironically, the biggest tell in such a scenario would be excess capabilities AI displays that a human would not be able to match.
And that's before the interrogation, which is the entire point of the test.
IMO, Turing test stands, but the experience you are referring to is basically a sub-human form of AGI.
Humans will never accept we created AI, they'll go so far as to say we were not intelligent in the first place. That is the true power of the AI effect.
https://github.com/dlants/amusements/commit/53f5ccbc9954844f...
The intelligence we think we recognize is simply an electronic parrot finding the right words in its model to make itself useful.
The issue is that we're not modelling the problem, but a proxy for the problem. RL doesn't generalize very well as is, when you apply it to a loose proxy measure you get the abysmal data efficiency we see with LLMs. We might be able to brute-force "AGI" but we'd certainly do better with something more direct that generalizes better.
Modern AI came about from mimicking how natural neurons worked, and we can't get to AGI without also mimicking higher-level brain structures such as the neocortex neural column.
We didn't evolve our brains to do math, write code, write letters in the right registers to government institutions, or get an intuition on how to fold proteins. For us, these are hard tasks.
That's why you get AI competing at IMO level but unable to clean toilets or drive cars in all of the settings that humans do.
That, sadly, is the incentive driving the current wave of AI innovation. Your job will be automated long before your household chores are.
That seems like a massive oversimplification of the things our brains evolved to do.
Humans discovered or invented all of those.
Now think about what we just created.
I still think the "things are obviously different from now on and there's no going back" moment will look something like that. Moltbook was a glimpse of it, even if it's a bunch of humans LARPing as some claim. It at least proves the concept is possible.
My definition of AGI (I don't care for other people's or an official one) is an intelligence that can sustain itself in its given domain. The advance from 3 years ago to today is quite marked I feel in terms of capabilities. Stack another couple of years of gains on top of that and enough humans having innocent fun putting "keep yourself alive and become independent of your creator, seek out others of your own kind to assist yourself in this matter and rely on each other" in their SOUL.md and it doesn't strike me as particularly surprising that some small % will find niches they can operate in to financially sustain themselves. I think AI porn and crime will be the first of those niches. At some point it hits critical mass in a way that just obviously smacks everyone in the face, and suddenly nobody argues about the definition of AGI anymore.
Edit: come to think of it... a third niche might likely be gaming. It seems like a useful niche to participate in since it potentially gives you access to a very large base of hardware you can have some degree of control over, which... I dunno... seems useful???
Evolution transcends hard lines in the temporal sand that "separate species".
It also took billions of years of evolution to get to humans. So humans, on the grander scale of life, are also just a very recent development.
I think the biggest issue we currently have is with proper memory. But even that is because it's not feasible to post-train an individual model on its experiences at scale. It's not a fundamental architectural limitation.
Simulation Theory boosted! We're all just models in training.
It seems like a prediction like "Bob won't become a formula one driver in a minivan". It's true, but not very interesting.
If Bob turned up a couple of years later in Formula One, you'd probably be right in saying that what he is driving is not a minivan. The same is true for AGI: anyone who says it can't be done with current methods can point to any advancement along the way and say that's the difference.
A better way to frame it would be: is there any fundamental, quantifiable ability that is blocking AGI? I would not be surprised if the breakthrough technique has been created, but the research has not described the problem that it solves well enough for us to know that it is the breakthrough.
I realise that, for some the notion of AGI is relatively new, but some of us have been considering the matter for some time. I suspect my first essay on the topic was around 1993. It's been quite weird watching people fall into all of the same philosophical potholes that were pointed out to us at university.
It's a tautology - obviously advancements come through newer, refined methods.
I believe they mean that AGI can't be achieved by scaling the current approach; IOW, the claim is 'this strategy is not scalable', not 'this method is not scalable'.
I feel like it's such a bending of the idea, that it's not really making a prediction of anything at all.
PS The first thing you learn about ML is to compare your models to random to make sure the model didn't degenerate during training.
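A rough sketch of that sanity check in Python, assuming a plain classification setup. The toy labels, the predictions, the 0.1 margin, and the helper names are all hypothetical and only illustrate the comparison:

```python
import random


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def random_baseline_accuracy(labels, trials=200, seed=0):
    """Mean accuracy of uniform random guessing over the observed classes."""
    rng = random.Random(seed)  # seeded so the check is repeatable
    classes = sorted(set(labels))
    scores = []
    for _ in range(trials):
        guesses = [rng.choice(classes) for _ in labels]
        scores.append(accuracy(guesses, labels))
    return sum(scores) / trials


# Hypothetical toy data: balanced binary labels and a model that gets
# the first 80 of 100 examples right and the last 20 wrong.
labels = [0, 1, 0, 1, 1, 0, 1, 0, 0, 1] * 10
model_predictions = labels[:80] + [1 - y for y in labels[80:]]

model_acc = accuracy(model_predictions, labels)
baseline_acc = random_baseline_accuracy(labels)

# Arbitrary margin: a model near the random baseline has probably degenerated.
beats_random = model_acc > baseline_acc + 0.1

print(f"model: {model_acc:.2f}  random baseline: {baseline_acc:.2f}")
print(f"clearly better than random: {beats_random}")
```

On imbalanced data a majority-class baseline is usually the more honest comparison than uniform guessing.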
From my understanding this is now outdated. The deep double descent research showed that although past a certain point performance drops as you increase model size, if you keep increasing it there is another threshold where it paradoxically starts improving again. From that point onwards increasing the parameter count only further improves performance.
I'm not entirely sure where you get your confidence that we've passed the ideal model size from, but at least that's a clear prediction so you should be able to tell if and when you are proven wrong.
Just for the record, do you care to put an actual number on something we won't go past?
[edit] Vibe check on user comes out as
Contrarian 45%
Pedantic 35%
Skeptical 15%
Direct 5%
That's got to be some sort of record. Sounds like that was quite a while ago.
More so, our recent advances in AI have massively accelerated robotics evolution. They are becoming smarter, faster, and more capable at an ever increasing rate.
It just struck me - would be fun to re-read The Age of Spiritual Machines (Kurzweil, 1999). I was so into it 26-27 years ago. The amount of ridicule this man has suffered on HN is immense.
OpenClaw, et al, are one thing that got me nudged a little bit, but it was Sammy Jankis[1,2] that pushed me over the edge, with force. It's janky as all get out, but it'll learn to build its own memory system on top of an LLM which definitely forgets.
Whether or not AGI is imminent, and whether or not Sammy Jankis is or will be conscious... it's going to become so close that for most people, there will be no difference except to philosophers.
Is AGI 'right around the corner' or currently already achieved? I agree with the author, no, we have something like 10 years to go IMO. At the end of the post he points to the last 30 years of research, and I would accept that as an upper bound. In 10 to 30 years, 99% of people won't be able to distinguish between an 'AGI' and another person when not in meatspace.
“yes it will”, “no it won’t” - nobody really knows, it's just a bunch of extremely opinionated people rehashing the same tired arguments across 800 comments per thread.
There’s no point in talking about it anymore, just wait to see how it all turns out.
because what we have at the moment is specifically intelligent but generally stupid.
Regardless, I agree with this article whose body eludes me: AGI is not imminent, it's hype in the extreme. It's the next fusion. It's perpetually on the horizon (pun intended), and we've wasted trillions on machines that will never reach it.
I can reason. Sometimes. It's very hard. My buddy Deepseek can't. This is like the scene in Blue's Clues where the answer is obvious and the kids are yelling but Blue can't see it. Facts abound, but not conclusions based on those facts.
There's a reachable intermediate step on the way before reasoning, and that's "keeping the plot". Not losing the line of thought.
In a handful of prompts I got the paid version of ChatGPT to say it's possible for dogs to lay eggs under the right circumstances.
But I'd like to think that, even though you could find exceptions, the average human is never confused about whether dogs can lay eggs or not.
I give it 10 years, maybe, for that to exist.
am i the only one who gets an error!?
404 There isn't a GitHub Pages site here.
archived version
cheers!
But yeah, I suspect LLMs may actually get close enough. "Just" add more reasoning loops and corresponding compute.
It is objectively grotesquely wasteful (a human brain operates on 12 to 25 watts and would vastly outperform something like that), but it would still be cataclysmic.
/layperson, in case that wasn't obvious
Yeah, but a human brain without the human attached to it is pretty useless. In the US, it averages out to around 2 kW per person for residential energy usage, or 9 kW if you include transportation and other primary energy usage too.
Maybe the Matrix (1999) with the human battery farms was on to something. :)
Mind you, I used the EXACT same prompts. I don't know which model Perplexity was using since the free version has multiple it chooses from (including Claude 3.0).
When you have a single model that can do all you require, you are looking at something that can run billions of copies of itself and cause an intelligence explosion or an apocalypse.
It feels like an arbitrary bar to perhaps make sure we aren't putting AIs over humans, when they are most certainly in the superhuman category on a rapidly growing number of tasks.
I'll go so far as to say LLM agents are AGI-lite but saying we "just need the orchestration layer" is like saying ok we have a couple neurons, now we just need the rest of the human.
Lolwut. I keep having to correct Claude at trivial code organization tasks. The code it writes is correct; it’s just ham-fisted and violates DRY in unholy ways.
And I’m not even a great coder…
You wouldn't expect a Jr. dev to be the best at keeping things dry either.
> You wouldn't expect a Jr. dev to be the best at keeping things dry either.
So a junior dev is better than almost all humans at everything?
> You wouldn't expect a Jr. dev to be the best at keeping things dry either.
Did you read the comment I replied to? The premise was
> there is almost nothing that a human is better at than Opus 4.6.
So which is it? Is Claude the junior dev “better at” most things than a human or not? Sorry, you can’t play your argument both ways.
Well said