Or the long version: "something about which no conclusions can be drawn because the proposed definitions lack sufficient precision and completeness."
Or the short versions: "Skippetyboop," "plipnikop," and "zingybang."
There are lots of meaningful definitions; the people saying we haven't reached AGI just don't use them. For most of the last half-century, people would have agreed that machines that can pass the Turing test and win Math Olympiad gold are AGI.
We’ll know AGI when we see it, and this ain’t it. This complaining about changing goalposts is so transparently sour grapes from people over-invested in hyping the current LLM paradigm.
Thinking is cool and all, but not that extraordinary. Even plants do it.
The same problem exists defining human intelligence, it's a problem with "intelligence" in general, artificial or not.
I was thinking something similar - this isn't AI, and none of "those people" care if it is or isn't. They don't care philosophically, or even pragmatically.
They're selling a product. That product is the IDEA of replacing the majority of human labor with what's basically slave labor, but with ethical quandaries you can mostly disregard.
It's honestly a genius product. I'm not surprised it's selling so well. I'm vaguely surprised so many people who don't stand to benefit in any way, shape, or form, or who will even potentially starve if it works out, are so keen on it. But there are always bootlickers.
The most unfortunate part is that when the party ends, it's none of "those people" who will suffer even in the slightest. I'm not even optimistic their egos will suffer, as Musk seems to show they are utterly immune even as their companies collapse under them.
"it is AGI when we can no longer come up with tasks easy for humans to solve but hard for computers"
> artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at most economically valuable work
"Highly autonomous systems" and "most economically valuable work" aren't precise enough to be useful.
"Highly" implies that there is a continuum, so where does directed end and autonomy begin?
"Most economically valuable work"... each word in that has wiggle room, not to mention that any reasonable interpretation of it is a shifting goalpost as the work done by humans over history has shifted a great deal.
The point is that none of this is defined in a way so that people can agree that something has AGI/ASI/etc. or not. If people can't agree then there's no point in talking about it.
EDIT: interestingly, the OpenAI definition of AGI specifically means that a subset of humans do not have AGI.
The definition reminds me of the common quip about robotics, "it's robotics when it doesn't work, once it works it's a machine".
If the definition has shifted once again to mean "a computer program that does a task pretty well for us", then what's the new term we're using to define human-level artificial intelligence?
"Most economically valuable work" is doing a ton of heavy lifting. What is considered economically valuable work is going to change from decade to decade, if not from year to year. What's considered economically valuable is also going to differ widely across individuals and nations within the exact same time frame.
And I get that there are workarounds; effectively a cron job every second prompting "do the next thing".
But in my personal definition of "highly autonomous" it would not need prompting at all. It would be thinking all the time, independently of requests.
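For what it's worth, that workaround is only a few lines. A minimal sketch, assuming the standard OpenAI Python client; the model name and prompts are purely illustrative:

    import time
    from openai import OpenAI  # pip install openai

    client = OpenAI()
    history = [{"role": "system", "content": "You are an autonomous agent."}]

    while True:  # the "cron job": the loop, not the model, supplies the initiative
        history.append({"role": "user", "content": "Do the next thing."})
        reply = client.chat.completions.create(model="gpt-4o", messages=history)
        history.append({"role": "assistant",
                        "content": reply.choices[0].message.content})
        time.sleep(1)

Note where the autonomy lives: in the while loop and the canned prompt, not in the model. That's exactly why it fails the "thinking all the time" bar.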
what does "economically" means here? would it cover teaching? child care? healthcare? etc.
If we define AGI as an AI not doing a preset task but can be used for general purpose, then we already have that. If we define it as human level intelligence at _every_ task, then some humans fail to be an AGI. If we define AGI as a magic algorithm that does every task autonomously and successfully then that thing may not exist at all, even inside our brains.
When the AGI term was first coined they probably meant something like HAL 9000. We have that now (and HAL gaining self-awareness or refusing commands are just for dramatic effect and not necessary). Goalposts are not stable in this game.
These days, when people say unqualified "AI", they mean neural networks, and transformer language models in particular.
It is very hard to have a meaningful discussion when different parties mean different things with the same words.
All humans fail to be AGI, by definition.
Yes.
This will never happen. LLMs are already being used very unsafely, and if this HN headline stays where it is, OpenAI will quietly remove their charter from their website.
The important context I think people may miss is, this does not require AI to be 10x or 5x or even 1x as good as a human programmer. Claude is worse than me in meaningful ways at the kind of code I need to write, but it’s still doing almost all my coding because after 4.6 it’s smart enough to understand when I explain what program it should have written.
'If you actually know what models are doing under the hood to produce output that...'
Anyone who tells you they know 'what models are doing under the hood' simply has no idea what they're talking about, and it's amazing how common this is.
I also think so, and in the meantime I have to admit a lot of people don't learn deeply either. Take math for example: how many STEM students from elite universities truly understood the definition of a limit, let alone calculus beyond simple calculation? Or how many data scientists can really intuitively understand Bayesian statistics? Yet millions of them were doing their jobs in a kinda fine way with the help of the stackexchange family, and now with the help of AI.
I don't think it was so much the naivety of idealism, but more an adoption of idealism and related language to help market what was actually being built: a profit-first organization that's taking its true form little by little.
You cannot get real, actual AGI (the same ability to perform tasks as a human) without a continuous cycle of learning and deep memory, which LLMs cannot do. The best LLM "memory" is a search engine and document summarizer stuffed into a context window (which is like having someone take an entire physics course, writing down everything they learn on post-it notes, then you ask a different person a physics question, and that different person has to skim all the post-it notes, and then write a new post-it note to answer you). To learn it would need RL (which requires specific novel inputs) and retraining (so that it can retain and compute answers with the learned input). This would all take too much time and careful input/engineering along with novel techniques. So AGI is too expensive, time consuming, and difficult for us to achieve without radically different designs and a whole lot more effort.
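The post-it analogy maps onto code almost directly. A toy sketch of the pattern (all names here are made up for illustration; a real system would use an LLM for summarization and embeddings for retrieval):

    notes: list[str] = []  # the post-it pile; model weights never change

    def summarize(text: str) -> str:
        # stand-in for an LLM summarization call
        return text[:200]

    def remember(exchange: str) -> None:
        notes.append(summarize(exchange))  # the "first person" writes a post-it

    def answer(question: str) -> str:
        # crude keyword retrieval: the "different person" skims the pile
        words = set(question.lower().split())
        relevant = [n for n in notes if words & set(n.lower().split())]
        # whatever is recalled must fit back into a fresh context window
        return "Notes:\n" + "\n".join(relevant) + "\nQ: " + question

However clever the retrieval gets, the answering model starts from zero every time; nothing it reads ever becomes something it knows.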
Not only are LLMs not AGI, they're still not even that great at being LLMs. Sure, they can do a lot of cool things, like write working code and tests. But tell one "don't delete files in X/", and after a while, it will delete all the files in "X/", whereas a human would likely remember it's not supposed to delete some files, and go check first. It also does fun stuff like follow arbitrary instructions from an attacker found in random documents, which most humans also wouldn't do. If they had a real memory and RL in real-time, they wouldn't have these problems. But we're a long way away from that.
LLMs are fine. They aren't AGI.
The statements of which "actual researchers" are you relying upon for your "next 30 years" estimate? How do you reconcile them with the sub-10- or even sub-5-year timelines of other AI researchers, like Daniel Kokotajlo[1] or Andrej Karpathy[2]? For that matter, what about polls of AI researchers, which usually obtain a median much shorter than 30 years [3]?
[1] https://x.com/DKokotajlo/status/1991564542103662729
[2] https://x.com/karpathy/status/1980669343479509025
[3] https://80000hours.org/2025/03/when-do-experts-expect-agi-to...
This discussion is already annoying and poised to get so much worse because hundred-billion-dollar companies now have a direct financial incentive to say they did it. I expect the definition will get softened to near meaninglessness so some marketing department can slap AGI on their thing.
Karpathy himself has publicly stated that AGI is only possible with a new paradigm (one his group is working toward). He claims RLHF and attention models are near the end of their logarithmic curve. The concept of the "self-training AI" is likely impossible without a new kind of model.
We will likely see some classes of human skills completely taken over by LLMs this decade: call centers (already capable in 2026), SWE (the next couple of years). Bear in mind the frontier labs have spent many billions on exhaustive training on every aspect of these domains. They are focusing training on the highest-value occupations, but the long tail is huge.
It will be interesting to see if this investment will be obviated by a "real AGI" capable of learning without going through the capital-intensive training steps of current models.
This is the best summary of an LLM that I’ve ever seen (for laypeople to “get it”) and is the first that accurately describes my experience. I will say, usually the notes passed to the second person are very impressive quality for the topic. But the “2nd person” still rarely has a deep understanding of it.
>the same ability to perform tasks as a human
The first chess AIs lost to chess grandmasters. AI does not need to be better than humans to be considered AI.
>without a continuous cycle of learning and deep memory, which LLMs cannot do.
But harnesses like Claude Code can, by storing and reading files and building tools to work with them.
>which is like having someone take an entire physics course, writing down everything they learn on post-it notes, then you ask a different person a physics question, and that different person has to skim all the post-it notes, and then write a new post-it note to answer you
This doesn't matter. You could say a chess AI is a bunch of different people who work together to explore distant paths of the search space. The fact that you can split things into steps does not disqualify it from being AI.
>But tell one "don't delete files in X/", and after a while, it will delete all the files in "X/"
Humans make mistakes and mess up things too. LLMs are better at needle in a haystack tests than humans.
>It also does fun stuff like follow arbitrary instructions from an attacker
A ton of people get phished or social engineered by attackers. This is the number 1 way people get hacked. Do not underestimate people's willingness to follow instructions from strangers.
They can reason brilliantly within a single conversation — just like an amnesic patient can hold an intelligent discussion — but the moment the session ends, everything is gone. No learning happened. No memory formed.
What's worse, even within a session, they degrade. Research shows that effective context utilization drops to <1% of the nominal window on some tasks (Paulsen 2025). Claude 3.5 Sonnet's 200K context has an effective window of ~4K on certain benchmarks. Du et al. (EMNLP 2025) found that context length alone causes 13-85% performance degradation — even when all irrelevant tokens are removed. Length itself is the poison.
This pattern is structurally identical to what I see in clinical practice every day. Anxiety fills working memory with background worry, hallucinations inject noise tokens, depressive rumination creates circular context that blocks updating. In every case, the treatment is the same: clear the context. Medication, sleep, or — for an LLM — a fresh session.
The industry keeps betting on bigger context windows, but that's expanding warehouse floor space while the desk stays the same size. The human brain solved this hundreds of millions of years ago: store everything in long-term memory, recall selectively when needed, consolidate during sleep, and actively forget what's no longer useful.
We can build the smartest single model in the world — the greatest genius humanity has ever seen — but a genius with no memory and no sleep is still just an amnesic savant. The ceiling isn't intelligence. It's architecture.
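For what it's worth, the store/recall/consolidate/forget cycle is simple enough to caricature in code, which makes the gap concrete. A toy sketch, not any lab's actual architecture:

    working: list[str] = []         # the "desk" (context window)
    long_term: dict[int, str] = {}  # the "warehouse"
    DESK_SIZE = 8

    def perceive(item: str) -> None:
        working.append(item)
        if len(working) > DESK_SIZE:  # desk full: consolidate, don't degrade
            consolidate()

    def consolidate() -> None:        # "sleep": compress the desk into storage
        for item in working:
            long_term[hash(item)] = item
        working.clear()

    def recall(cue: str) -> list[str]:  # selective retrieval, not a bulk reload
        return [v for v in long_term.values() if cue in v]

    def forget(cue: str) -> None:       # active forgetting keeps storage useful
        for k in [k for k, v in long_term.items() if cue in v]:
            del long_term[k]

An LLM session implements perceive and an ever-growing desk; the other three functions are the missing architecture.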
If we can crack long memory we're most of the way there. But you need RL in addition to long memory or the model doesn't improve. Part of the genius of humans is their adaptability. Show them how to make coffee with one coffee machine and they adapt to pretty much every other coffee machine; that's not just memory, that's RL. (Or a simpler example: crows are more capable of learning and acting with memory than an LLM is.)
Currently the only way around both of these is brute force (take in RL input from users/experiments, re-train the models constantly), and that's both very slow and error-prone (the flaws in models' thinking come from a lack of high-quality RL inputs). So without two major breakthroughs we're stuck tweaking what we've got.
Sonnet 3.5 is old hat, and today's Sonnet 4.6 ships with an extra long 1M context window. And performs better on long context tasks while at it.
There are also attempts to address long context attention performance on the architectural side - streaming, learned KV dropout, differential attention. All of which can allow LLMs to sustain longer sessions and leverage longer contexts better.
If we're comparing to wet meat, then the closest thing humans have to context is working memory. Which humans also get a limited amount of - but can use to do complex work by loading things in and out of it. Which LLMs can also be trained to do. Today's tools like file search and context compression are crude versions of that.
All those "it's like ..." are faulty – "post-it notes" are not 3k pages of text that can be recalled instantly in one go, copied in fraction of a second to branch off, quickly rewritten, put into hierarchy describing virtually infinite amount of information (outside of 3k pages of text limit), generated on the fly in minutes on any topic pulling all information available from computer etc.
Poor man's RL on test-time context (skills and friends) is something that shouldn't be discarded; we're at 1M tokens and growing, and progressive disclosure (without anything fancy, just a bunch of markdown files in directories) means you can already stuff more information into always-on agents/swarms than a human can remember in a whole lifetime.
Currently the latest models use more compute on RL than on pre-training, and this upward trend continues (from orders of magnitude smaller than pre-training to larger than pre-training). In that sense some form of continuous RL is already happening; it's just quantized into new model releases, not realtime.
With LoRA and friends it's also already possible to do continuous training that directly affects weights; it's just that the economics of it are not that great – you get a much better value/cost ratio with the above instead.
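A minimal sketch of such a weight-touching update with Hugging Face's peft library; gpt2 and the training text are placeholders, and a production setup would batch updates rather than run one per interaction:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("gpt2")
    tok = AutoTokenizer.from_pretrained("gpt2")

    # low-rank adapters on the attention projections; base weights stay frozen
    model = get_peft_model(model, LoraConfig(
        r=8, lora_alpha=16, lora_dropout=0.05,
        target_modules=["c_attn"], task_type="CAUSAL_LM"))

    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # one "continuous learning" step on a fresh piece of feedback
    batch = tok("Correction: the deploy script takes --region, not --zone.",
                return_tensors="pt")
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward(); opt.step(); opt.zero_grad()

The adapters are megabytes, not gigabytes, which is why this is feasible at all; the catch, as above, is that per-user training still loses on value/cost to stuffing context.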
For some definitions of AGI it already happened, i.e. "somebody's computer-use-based work", even though "it can't actually flip burgers, can it?" is true, just not relevant.
ps. I should also mention that I don't believe in "programmers losing jobs"; on the contrary, we will have to ramp up large numbers of people on computational thinking, and those who are already versed in it will keep reaping benefits – regardless of whether anybody agrees that AGI is already here, it arrives through computational doors speaking a computational language first, and imho this property is here to stay as it's an expression of rationality etc.
The human eye processes between 100GB and 800GB of data per day. We then continuously learn and adapt from this firehose of information, using short-term and long-term memory, which is continuously retrained and weighted. This isn't "book knowledge", but the same capability is needed to continuously learn and reason on a human-equivalent level. You'd need a supercomputer to attempt it, for a single human's learning and reasoning.
RL is used for SOTA models, but it's a constant game of catch-up with limited data and processing. It's like self-driving cars. How many millions of miles have they already captured? Yet they still fail at some basic driving tasks. It's because the cars can't learn or form long-term memories, much less process and act on the vast amount of data a human can in real time. Same for LLM. Training and tweaking gets you pretty far, but not matching humans.
> With LoRA and friends it's also already possible to do continuous training that directly affects weights, it's just that economy of it is not that great
And that means we're stuck with non-AGI. Which is fine! We could've had flying cars decades ago, but that was hard, expensive and unnecessary, so we didn't do that. There's not enough money in the global economy to "spend" our way to AGI in a short timeframe, even if we wanted to spend it all, even if we could build all the datacenters quickly enough, which we can't (despite being a huge nation, there are many limitations).
> For some definitions of AGI
Changing the goalposts is dangerous. A lot of scary real-world stuff is hung on the idea of AGI being here or not. People will keep getting more and more freaked out and acting out if we're not clear on what is really happening. We don't have AGI. We have useful LLMs and VLMs.
> it will delete all the files in "X/"
How many "I deleted the prod database" stories have you seen? Humans do this too.
> follow arbitrary instructions from an attacker found in random documents
This is just the AI equivalent of phishing - inability to distinguish authorized from unauthorized requests.
Whenever people start criticizing AI, they always seem to conveniently leave out all the stupid crap humans do and compare AI against an idealized human instead.
If you've used the latest models extensively, you must've noticed times when AI 'runs out of common sense' and keeps trying stupid stuff.
I'm somewhat convinced that the amazing (and improving!) coding ability of these LLMs comes from being RLHF'd on the conversations it's having with programmers, with each successfully resolved bug or implemented feature ending up in training data.
Thus we are involuntarily building the world's biggest stackoverflow.
Which for the record is incredibly useful, and may even put most programmers out of a job (who I think at that point should feel a bit stupid for letting this happen), but it's not necessarily AGI.
Which fundamental limitation do you mean? I haven't seen anything but slow, iterative improvements. Sure, it feels fine; a turtle can eventually do a 10,000-mile trek, but just because it's moving its left and right feet and decreasing the distance doesn't mean it's getting there anytime soon.
Parent mentioned way harder hurdles than iterative increments can tackle: rather, radically new... everything.
Humans generally do it by accident. They don't preface it with "Let me delete the production database," which LLMs do.
Humans do it accidentally.
The boundaries of these systems are very easy to find, though. Try to play any kind of game with them that isn't a prediction game, or perhaps even some that are (try to play chess with an LLM, it's amusing).
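Easy to verify with the python-chess library: have the model play in SAN and legality-check each move. A sketch with the "LLM output" hard-coded where the API call would go; in practice models open fine (openings saturate the training data) and start proposing illegal moves once the game leaves book lines:

    import chess  # pip install chess

    board = chess.Board()
    llm_moves = ["e4", "e5", "Nf3", "Nc6", "Bxf7"]  # stand-in for LLM output

    for san in llm_moves:
        try:
            board.push_san(san)  # raises ValueError on an illegal move
        except ValueError:
            print(f"Illegal move in this position: {san}")
            break

Here Bxf7 fails because no white bishop attacks f7 yet; it's exactly the kind of plausible-looking move a next-token predictor emits.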
Eh? Which limitations were solved?
I disagree that this prerequisite is more necessary than e.g. having legs to move over the ground. But besides that, current LLMs are literally the result of a continuous cycle of learning and deep memory. It's pretty crude compared to what evolution and the human process had to do, but that's precisely what the iterative model development cycle with the hierarchical bootstrap looks like. It's not fully autonomous though (engineer-driven/humans in the loop). Moreover, the distillation process you describe is precisely what "learning" is.
Have you seriously never had someone go do something you told them not to do?
> It also does fun stuff like follow arbitrary instructions from an attacker found in random documents, which most humans also wouldn't do.
I guess my coworker didn't actually fall for that "hey this is your CEO, please change my password" WhatsApp message then, phew.
I've seen people move the goalposts on what it means for AI to be intelligent, but this is the first time I've seen someone move the goalposts on what it means for humans to be intelligent.
So that leads to the question of what qualifies as intelligent? And do we need sentience for intelligence? What about self-agency/-actuation? Is that needed for "generally intelligent"?
I don't know.
But I feel like we're not there yet, even for non-sentient intelligence. I personally think we need an "unlimited" context (as good as human memory context windows anyway, which some argue we've already surpassed) and genuine self-learning before we get close. I don't think we need it to be an infallible genius (i.e. ASI) to qualify as generally intelligent ... or to put it another way, "about as smart and reliable as the average human adult", which frankly is quite a low bar!
One thing for sure though, I think this will creep up on us and one day it will suddenly become apparent that it's already there and we just didn't appreciate/notice/comprehend. There won't be a big fireworks display the moment it happens, more of a creeping realisation I think.
I give it 5 years +/-2.
Gave the same prompt to GPT 5.4 (high) and Opus 4.6 (high).
GPT 5.4 implemented the feature, refactored the code (was not asked to), removed comments that were not added in that session, made the code less readable, and introduced a bug. "Undo All".
Opus 4.6 correctly recognized that the feature is already implemented in the current code (yeah, lol) and proposed implementing tests and updating the docs.
Opus 4.6 is still the best coding agent.
So yeah, GPT 5.4 (high) didn't even check if the feature was already implemented.
Tried other tasks, tried "medium" reasoning - disappointment.
I am not sure one can really extrapolate much out of that, but I do find it interesting nonetheless.
I think language is also an important factor. I have a hard time deciding which of the two LLMs is worse at Swift, for example. They both seem equally great and awful in different ways.
Funny how timely this is, with Karpathy's Autoresearch hitting the top of HN yesterday (and this being an indication that frontier labs probably have much larger scale versions of this)
> Achieving AGI, he conceded, will require “a lot of medium-sized breakthroughs. I don’t think we need a big one.”
> At the Snowflake Summit in June 2025, Altman predicted that 2026 would mark a breakthrough when AI systems begin generating “novel insights” rather than simply recombining existing information. This represents a threshold he considers critical on the path to AGI.
Though I'm sure they'll try to change the charter before we get to that point, but yeah.
Which such project is that, though? And would it accept OpenAI's assistance?
AGI, having access to our world, is precarious, as alignment with humans is never guaranteed. Having a buffering medium, aka a simulation environment where the AI operates, might be a better in-between solution.
A great point. I saw blinding idealism during the early days of GPT era.
This doesn't seem contradictory if you consider that success at AGI will solve the problem of carbon emissions, one way or another. If one data center ultimately replaces a whole medium-sized city of commuters...
“The changing goalposts of AGI and timelines. Notably, it’s common to now talk about ASI instead, implying we may have already achieved AGI, almost without noticing.”
Amen
"Therefore, if a value-aligned, safety-conscious project comes close to building AGI before we do, we commit to stop competing with and start assisting this project."
I claim that currently no "value-aligned, safety-conscious project comes close to building AGI", failing on both counts:
- "value-aligned, safety-conscious" and
- "close to building AGI".
So, based on this charter, OpenAI has no reason to surrender the race.
The tried and true technique of speeding up completion of an established project by adding more people; well known for the tendency to fail successfully.
Are you sure Anthropic isn't aware of this and angling for it? And are you sure what Anthropic says is really value-aligned and safety-conscious? The PR bit surely is working, right?
Even the quote they used questions the premise of the article
> “We basically have built AGI” (later: “a spiritual statement, not a literal one”)
Imho that's a big part of why people are shifting to ASI. Not because we reached AGI, but because "we reached ASI" is a well-defined, verifiable statement, whereas "we reached AGI" just isn't.
I remember when computers became better than humans at chess, many people were shocked and saw that as machines becoming more intelligent than humans. Because being good at chess was considered equivalent to "being smart".
So... we can't tell when the rocket has left Earth atmosphere, but we can tell when the rocket has entered space?
I'm not getting how "superior in all tasks" is better-defined for you than "equal in all tasks".
I completely agree. We can't even measure each other well, let alone machines.
> It can be debated whether arena.ai is a suitable metric for AGI, a strong case can probably be made for why it’s not. However, that’s irrelevant, as the spirit of the self-sacrifice clause is to avoid an arms race, and we are clearly in one.
No, the spirit is clearly meant for near AGI and we aren’t near AGI
The "S" stands for Safety.
Laws & regulations that needs to be created to reign in AI will undoubtedly increase the opportunity cost of training LLMs.
For some, it might be similar to the early 2000s, but I think it's just a healthy rebalancing of what AI is, and how society needs to implement this new, hardly controllable paradigm. With this perspective, OpenAI has a lot to lose, as it hasn't been able to create a moat for itself compared to, let's say, Anthropic.
Some of the apps made possible by smartphones only appeared a decade after they were made technically possible. A lot of the new use cases made possible by the Internet and broadband connections only became widely used because of Covid.
I was already using Skype 20 years ago to make video calls, but I've only seen PTA meetings over Zoom since Covid.
I guess what I failed to convey in my original comment was that, like the Internet 20 years ago, the current advancement made by AI might stall at a foundational level, while the landscape evolves.
Essentially, I believe what you're saying is really close in spirit to what I'm saying.
I'll eat my hat after I sell you a bridge.
previous title: Based on its own charter, OpenAI should surrender the race
(The article itself strikes me as better than that)
> It can be debated whether arena.ai is a suitable metric for AGI, a strong case can probably be made for why it’s not. However, that’s irrelevant, as the spirit of the self-sacrifice clause is to avoid an arms race, and we are clearly in one.
> Therefore, one can only conclude, that we currently meet the stated example triggering condition of “a better-than-even chance of success in the next two years”. As per its charter, OpenAI should stop competing with the likes of Anthropic and Gemini, and join forces, however that might look like.
The new title is a single, almost throwaway, line from the article.
> While this will never happen, I think it’s illustrative of some great points for pondering:
> The impotence of naive idealism in the face of economic incentives. The discrepancy between marketing points and practical actions. The changing goalposts of AGI and timelines. Notably, it’s common to now talk about ASI instead, implying we may have already achieved AGI, almost without noticing.
This does not work. From the guidelines:
> Please don't post on HN to ask or tell us something. Send it to hn@ycombinator.com.
And that's it.
Everything beyond that is nuance.
Nuance matters, but it's not the real story, it's the side show.
- we are building Open AI - only if you have more than $10B net worth
- we are against using AI for military purposes - except when that case is allowed by government
- we are on a mission to help humanity - again, we define humanity as set of people with more than $10B net worth
- surrender? - sure, sure, we will, only to people with more than $10B net worth, they can do whatever they want to our models, we will surrender to them
Is this really the garbage that should lead humanity to our future? Because inevitably that will be a dark future for 99.99% of humans. And no, you and you won't be part of that 0.01%, or whatever tiny elite thinks they are better than the rest of us.
This is all just very naive
The point you're desperately trying to miss is that most other companies don't put up those moral claims in the first place
If big corporations do things that are unethical, they should be called out, even if they're common. Saying "well everyone's doing it", isn't a good excuse to do things that are unethical.
It's not "naive" to point out the lies that OpenAI told to get to the point that they are now. They were claiming to be a non-profit for awhile, they grew in popularity based in part on that early good-will, and now they are a for-profit company looking to IPO into one of the most valuable corporations on the planet. That's a weird thing. That's a thing that seems to be kind of antithetical to their initial purpose. People should point that out.
- Caitlin Kalinowski, previously head of robotics at OpenAI
https://www.linkedin.com/posts/ckalinowski_i-resigned-from-o...
But I am trying to understand this from the perspective of defence & govt. Why is it so business-as-usual for them? Do they consider this on par with missiles with infrared/heat sensors for tracking/locking? Where does the definition of lethal autonomy begin and end?
Just putting this out there as a point to ponder on. By itself, this may rightly be too broad and should be debated.
On its face that’s not a crazy stance: Governments are meant to represent the public, while private companies obviously aren't. I think it’s somewhat understandable why the government might reject that kind of "we know better than you" type of clause.
Of course, the reaction is wildly out of proportion. A normal response would just be to stop doing business with the company and move on. Labeling them a supply chain risk is an extreme response.
If you're one of the contractors working in NRO or aware of Sentient, OpenAI and Anthropic probably do look like supply chain risks. They want to subsume the work you're already doing with more extreme limitations (ones that might already be violated). So now you're pitching backup service providers, analyzing the cost of on-prem, and pricing out your own model training; it would be really convenient if OpenAI just agreed to terms. As a contractor, you can make them an offer so good that it would be career suicide to refuse it.
Autonomous weapons are a horse of a different color, but it's safe to assume the same discussions are happening inside Anduril et al.
https://www.vp4association.com/aircraft-information-2/32-2/m...
A less charitable interpretation is that the current doctrine is "China / Russia will build autonomous killbots, so we can't allow a killbot gap".
I'm frankly less concerned about "proper" military uses than I am about the tech bleeding into the sphere of domestic law enforcement, as it inevitably will.
Hum...
The one thing domestic surveillance enables is defining targets inside the country, and the one thing lethal autonomy enables is executing targets that a soldier would refuse to.
Those things don't have other uses.
A 2017 national intelligence law compels Chinese companies and individuals to cooperate with state intelligence when asked, without any public notice.
China has no equivalent of the whistleblower protection that enables resignations with public letters explaining why, protests, open letters with many signatures, etc. Whenever you see "Chinese whistleblower" in the news, you're looking at someone who quietly fled the country first and then blew the whistle. Example: https://www.cnn.com/2026/02/27/us/china-nyc-whistleblower-uf...
Why do I not believe this at all? Were things truly sunshine and roses at OpenAI up until this Pentagon debacle? Perhaps I am mistaken, but it seemed like the writing was on the wall years ago.
> I have deep respect for Sam and the team
I have even more questions now.
The whole public debacle was planned; the ToS isn't stopping the Pentagon from doing anything (as we've seen with OpenAI now).
>claims to be some topshot data scientist
okay
One can argue that they have already achieved this, at least for short-term tasks. Humans are still better at organization, collaboration, and carrying out very long tasks like managing a project or a company.
No, because they're hugely reliant on their training data and can't really move beyond their training data. This is why you haven't seen an explosion of new LLM-aided scientific discoveries, why Suno can't write a song in a new genre (even if you explain it to Suno in detail and give it actual examples,) etc.
This should tell you something enormous about (1) their future potential and (2) how their "intelligence" is rooted in essentially baseline human communications.
Admittedly LLMs are superhuman in the performance of tasks which are, for want of a better term, "conventional" -- and which are well-represented in their training data.
I don’t even think humans can “move beyond” their sensory data. They generalize using it, which is amazing, but they are still limited by it.* So why is this a reasonable standard for non-biological intelligence?
We have compelling evidence that both can learn in unsupervised settings. (I grant one has to wrap a transformer model with a training harness, but how can anyone sincerely consider this as a disqualifier while admitting that an infant cannot raise itself from birth!)
I’m happy to discuss nuance like different architectures (carbon versus silicon, neurons versus ANNs, etc), but the human tendency to move the goalposts is not something to be proud of. We really need to stop doing this.
* Jeff Hawkins describes the brain as relentlessly searching for invariants from its sensory data. It finds patterns in them and generalizes.
SoTA models are at least very close to AGI when it comes to textual and still image inputs for most domains. In many domains, SoTA AI is superhuman both in time and speed. (Not wrt energy efficiency.*)
AI SoTA for video is not at AGI level, clearly.
Many people distinguish intelligence from memory. With this in mind, I think one can argue we’ve reached AGI in terms of “intelligence”; we just haven’t paired it up with enough memory yet.
* Humans have a really compelling advantage in terms of efficiency; brains need something like 20W. But AGI as a threshold has nothing directly to do with power efficiency, does it?
LLMs can't be swapped in for human workers in general because there are still a lot of things they don't do, like learning as they go. So that's missing from the Wikipedia thing.