Or the long version: "something about which no conclusions can be drawn because the proposed definitions lack sufficient precision and completeness."
Or the short versions: "Skippetyboop," "plipnikop," and "zingybang."
There are lots of meaningful definitions; the people saying we haven't reached AGI just don't use them. For most of the last half-century, people would have agreed that machines that can pass the Turing test and win Math Olympiad gold are AGI.
We’ll know AGI when we see it, and this ain’t it. This complaining about changing goalposts is so transparently sour grapes from people over-invested in hyping the current LLM paradigm.
Thinking is cool and all, but not that extraordinary. Even plants do it.
The same problem exists defining human intelligence, it's a problem with "intelligence" in general, artificial or not.
I was thinking something similar - this isn't AI, and none of "those people" care if it is or isn't. They don't care philosophically, or even pragmatically.
They're selling a product. That product is the IDEA of replacing the majority of human labor with what's basically slave labor, but with ethical quandaries you can mostly disregard.
It's honestly a genius product. I'm not surprised it's selling so well. I'm vaguely surprised so many people who don't stand to benefit in any way, shape, or form, or who will even potentially starve if it works out, are so keen on it. But there are always bootlickers.
The most unfortunate part is that when the party ends, it's none of "those people" who will suffer even in the slightest. I'm not even optimistic their egos will suffer, as Musk seems to show they are utterly immune even as their companies collapse under them.
"it is AGI when we can no longer come up with tasks easy for humans to solve but hard for computers"
> artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at most economically valuable work
"Highly autonomous systems" and "most economically valuable work" aren't precise enough to be useful.
"Highly" implies that there is a continuum, so where does directed end and autonomy begin?
"Most economically valuable work"... each word in that has wiggle room, not to mention that any reasonable interpretation of it is a shifting goalpost as the work done by humans over history has shifted a great deal.
The point is that none of this is defined in a way so that people can agree that something has AGI/ASI/etc. or not. If people can't agree then there's no point in talking about it.
EDIT: interestingly, the OpenAI definition of AGI specifically means that a subset of humans do not have AGI.
The definition reminds me of the common quip about robotics, "it's robotics when it doesn't work, once it works it's a machine".
If the definition has shifted once again to mean "a computer program that does a task pretty well for us", then what's the new term we're using to define human-level artificial intelligence?
"Most economically valuable work" is doing a ton of heavy lifting. What is considered economically valuable work is going to change from decade to decade, if not from year to year. What's considered economically valuable is also going to differ widely across individuals and nations within the exact same time frame.
And I get that there are workarounds; effectively a cron job every second prompting "do the next thing".
But in my personal definition of "highly autonomous" it would not need prompting at all. It would be thinking all the time, independently of requests.
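For what it's worth, that workaround is only a few lines. A minimal sketch, assuming the standard OpenAI Python client; the model name and prompts are purely illustrative:

    import time
    from openai import OpenAI  # pip install openai

    client = OpenAI()
    history = [{"role": "system", "content": "You are an autonomous agent."}]

    while True:  # the "cron job": the loop, not the model, supplies the initiative
        history.append({"role": "user", "content": "Do the next thing."})
        reply = client.chat.completions.create(model="gpt-4o", messages=history)
        history.append({"role": "assistant",
                        "content": reply.choices[0].message.content})
        time.sleep(1)

Note where the autonomy lives: in the while loop and the canned prompt, not in the model. That's exactly why it fails the "thinking all the time" bar.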
what does "economically" means here? would it cover teaching? child care? healthcare? etc.
If we define AGI as an AI not doing a preset task but can be used for general purpose, then we already have that. If we define it as human level intelligence at _every_ task, then some humans fail to be an AGI. If we define AGI as a magic algorithm that does every task autonomously and successfully then that thing may not exist at all, even inside our brains.
When the AGI term was first coined they probably meant something like HAL 9000. We have that now (and HAL gaining self-awareness or refusing commands are just for dramatic effect and not necessary). Goalposts are not stable in this game.
These days, when people say unqualified "AI", they mean neural networks, and transformer language models in particular.
It is very hard to have a meaningful discussion when different parties mean different things with the same words.
All humans fail to be AGI, by definition.
Yes.
This will never happen. LLMs are already being used very unsafely, and if this HN headline stays where it is, OpenAI will quietly remove their charter from their website.
The important context I think people may miss is, this does not require AI to be 10x or 5x or even 1x as good as a human programmer. Claude is worse than me in meaningful ways at the kind of code I need to write, but it’s still doing almost all my coding because after 4.6 it’s smart enough to understand when I explain what program it should have written.
'If you actually know what models are doing under the hood to produce output that...'
Anyone who tells you they know 'what models are doing under the hood' simply has no idea what they're talking about, and it's amazing how common this is.
I also think so, and in the meantime I have to admit a lot of people don't learn deeply either. Take math for example: how many STEM students from elite universities truly understood the definition of a limit, let alone calculus beyond simple calculation? Or how many data scientists can really intuitively understand Bayesian statistics? Yet millions of them were doing their jobs in a kinda fine way with the help of the stackexchange family, and now with the help of AI.
I don't think it was so much the naivety of idealism, but more an adoption of idealism and related language to help market what was actually being built: a profit-first organization that's taking its true form little by little.
You cannot get real, actual AGI (the same ability to perform tasks as a human) without a continuous cycle of learning and deep memory, which LLMs cannot do. The best LLM "memory" is a search engine and document summarizer stuffed into a context window (which is like having someone take an entire physics course, writing down everything they learn on post-it notes, then you ask a different person a physics question, and that different person has to skim all the post-it notes, and then write a new post-it note to answer you). To learn it would need RL (which requires specific novel inputs) and retraining (so that it can retain and compute answers with the learned input). This would all take too much time and careful input/engineering along with novel techniques. So AGI is too expensive, time consuming, and difficult for us to achieve without radically different designs and a whole lot more effort.
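The post-it analogy maps onto code almost directly. A toy sketch of the pattern (all names here are made up for illustration; a real system would use an LLM for summarization and embeddings for retrieval):

    notes: list[str] = []  # the post-it pile; model weights never change

    def summarize(text: str) -> str:
        # stand-in for an LLM summarization call
        return text[:200]

    def remember(exchange: str) -> None:
        notes.append(summarize(exchange))  # the "first person" writes a post-it

    def answer(question: str) -> str:
        # crude keyword retrieval: the "different person" skims the pile
        words = set(question.lower().split())
        relevant = [n for n in notes if words & set(n.lower().split())]
        # whatever is recalled must fit back into a fresh context window
        return "Notes:\n" + "\n".join(relevant) + "\nQ: " + question

However clever the retrieval gets, the answering model starts from zero every time; nothing it reads ever becomes something it knows.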
Not only are LLMs not AGI, they're still not even that great at being LLMs. Sure, they can do a lot of cool things, like write working code and tests. But tell one "don't delete files in X/", and after a while, it will delete all the files in "X/", whereas a human would likely remember it's not supposed to delete some files, and go check first. It also does fun stuff like follow arbitrary instructions from an attacker found in random documents, which most humans also wouldn't do. If they had a real memory and RL in real-time, they wouldn't have these problems. But we're a long way away from that.
LLMs are fine. They aren't AGI.
The statements of which "actual researchers" are you relying upon for your "next 30 years" estimate? How do you reconcile them with the sub-10- or even sub-5-year timelines of other AI researchers, like Daniel Kokotajlo[1] or Andrej Karpathy[2]? For that matter, what about polls of AI researchers, which usually obtain a median much shorter than 30 years [3]?
[1] https://x.com/DKokotajlo/status/1991564542103662729
[2] https://x.com/karpathy/status/1980669343479509025
[3] https://80000hours.org/2025/03/when-do-experts-expect-agi-to...
This discussion is already annoying and poised to get so much worse because hundred-billion-dollar companies now have a direct financial incentive to say they did it. I expect the definition will get softened to near meaninglessness so some marketing department can slap AGI on their thing.
Karpathy himself has publicly stated that AGI is only possible with a new paradigm (one his group is working toward). He claims RLHF and attention models are near the end of their logarithmic curve. The concept of the "self-training AI" is likely impossible without a new kind of model.
We will likely see some classes of human skills completely taken over by LLMs this decade: call centers (already capable in 2026), SWE (the next couple of years). Bear in mind the frontier labs have spent many billions on exhaustive training on every aspect of these domains. They are focusing training on the highest-value occupations, but the long tail is huge.
It will be interesting to see if this investment will be obviated by a "real AGI" capable of learning without going through the capital-intensive training steps of current models.
This is the best summary of an LLM that I’ve ever seen (for laypeople to “get it”) and is the first that accurately describes my experience. I will say, usually the notes passed to the second person are very impressive quality for the topic. But the “2nd person” still rarely has a deep understanding of it.
>the same ability to perform tasks as a human
The first chess AIs lost to chess grandmasters. AI does not need to be better than humans to be considered AI.
>without a continuous cycle of learning and deep memory, which LLMs cannot do.
But harnesses like Claude Code can, by storing and reading files and building tools to work with them.
>which is like having someone take an entire physics course, writing down everything they learn on post-it notes, then you ask a different person a physics question, and that different person has to skim all the post-it notes, and then write a new post-it note to answer you
This doesn't matter. You could say a chess AI is a bunch of different people who work together to explore distant paths of the search space. The fact that you can split things into steps does not disqualify it from being AI.
>But tell one "don't delete files in X/", and after a while, it will delete all the files in "X/"
Humans make mistakes and mess up things too. LLMs are better at needle in a haystack tests than humans.
>It also does fun stuff like follow arbitrary instructions from an attacker
A ton of people get phished or social engineered by attackers. This is the number 1 way people get hacked. Do not underestimate people's willingness to follow instructions from strangers.
They can reason brilliantly within a single conversation — just like an amnesic patient can hold an intelligent discussion — but the moment the session ends, everything is gone. No learning happened. No memory formed.
What's worse, even within a session, they degrade. Research shows that effective context utilization drops to <1% of the nominal window on some tasks (Paulsen 2025). Claude 3.5 Sonnet's 200K context has an effective window of ~4K on certain benchmarks. Du et al. (EMNLP 2025) found that context length alone causes 13-85% performance degradation — even when all irrelevant tokens are removed. Length itself is the poison.
This pattern is structurally identical to what I see in clinical practice every day. Anxiety fills working memory with background worry, hallucinations inject noise tokens, depressive rumination creates circular context that blocks updating. In every case, the treatment is the same: clear the context. Medication, sleep, or — for an LLM — a fresh session.
The industry keeps betting on bigger context windows, but that's expanding warehouse floor space while the desk stays the same size. The human brain solved this hundreds of millions of years ago: store everything in long-term memory, recall selectively when needed, consolidate during sleep, and actively forget what's no longer useful.
We can build the smartest single model in the world — the greatest genius humanity has ever seen — but a genius with no memory and no sleep is still just an amnesic savant. The ceiling isn't intelligence. It's architecture.
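For what it's worth, the store/recall/consolidate/forget cycle is simple enough to caricature in code, which makes the gap concrete. A toy sketch, not any lab's actual architecture:

    working: list[str] = []         # the "desk" (context window)
    long_term: dict[int, str] = {}  # the "warehouse"
    DESK_SIZE = 8

    def perceive(item: str) -> None:
        working.append(item)
        if len(working) > DESK_SIZE:  # desk full: consolidate, don't degrade
            consolidate()

    def consolidate() -> None:        # "sleep": compress the desk into storage
        for item in working:
            long_term[hash(item)] = item
        working.clear()

    def recall(cue: str) -> list[str]:  # selective retrieval, not a bulk reload
        return [v for v in long_term.values() if cue in v]

    def forget(cue: str) -> None:       # active forgetting keeps storage useful
        for k in [k for k, v in long_term.items() if cue in v]:
            del long_term[k]

An LLM session implements perceive and an ever-growing desk; the other three functions are the missing architecture.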
If we can crack long memory we're most of the way there. But you need RL in addition to long memory or the model doesn't improve. Part of the genius of humans is their adaptability. Show them how to make coffee with one coffee machine and they adapt to pretty much every other coffee machine; that's not just memory, that's RL. (Or a simpler example: crows are more capable of learning and acting with memory than an LLM is.)
Currently the only way around both of these is brute force (take in RL input from users/experiments, re-train the models constantly), and that's both very slow and error-prone (the flaws in models' thinking come from a lack of high-quality RL inputs). So without two major breakthroughs we're stuck tweaking what we've got.
Sonnet 3.5 is old hat, and today's Sonnet 4.6 ships with an extra long 1M context window. And performs better on long context tasks while at it.
There are also attempts to address long context attention performance on the architectural side - streaming, learned KV dropout, differential attention. All of which can allow LLMs to sustain longer sessions and leverage longer contexts better.
If we're comparing to wet meat, then the closest thing humans have to context is working memory. Which humans also get a limited amount of - but can use to do complex work by loading things in and out of it. Which LLMs can also be trained to do. Today's tools like file search and context compression are crude versions of that.
All those "it's like ..." are faulty – "post-it notes" are not 3k pages of text that can be recalled instantly in one go, copied in fraction of a second to branch off, quickly rewritten, put into hierarchy describing virtually infinite amount of information (outside of 3k pages of text limit), generated on the fly in minutes on any topic pulling all information available from computer etc.
Poor man's RL on test-time context (skills and friends) is something that shouldn't be discarded; we're at 1M tokens and growing, and progressive disclosure (without anything fancy, just a bunch of markdown files in directories) means you can already stuff more information into always-on agents/swarms than a human can remember in a whole lifetime.
Currently the latest models use more compute on RL than on pre-training, and this upward trend continues (from orders of magnitude smaller than pre-training to larger than pre-training). In that sense some form of continuous RL is already happening; it's just quantized into new model releases, not realtime.
With LoRA and friends it's also already possible to do continuous training that directly affects weights; it's just that the economics of it are not that great – you get a much better value/cost ratio with the above instead.
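A minimal sketch of such a weight-touching update with Hugging Face's peft library; gpt2 and the training text are placeholders, and a production setup would batch updates rather than run one per interaction:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("gpt2")
    tok = AutoTokenizer.from_pretrained("gpt2")

    # low-rank adapters on the attention projections; base weights stay frozen
    model = get_peft_model(model, LoraConfig(
        r=8, lora_alpha=16, lora_dropout=0.05,
        target_modules=["c_attn"], task_type="CAUSAL_LM"))

    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # one "continuous learning" step on a fresh piece of feedback
    batch = tok("Correction: the deploy script takes --region, not --zone.",
                return_tensors="pt")
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward(); opt.step(); opt.zero_grad()

The adapters are megabytes, not gigabytes, which is why this is feasible at all; the catch, as above, is that per-user training still loses on value/cost to stuffing context.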
For some definitions of AGI it already happened, i.e. "somebody's computer-use-based work", even though "it can't actually flip burgers, can it?" is true, just not relevant.
ps. I should also mention that I don't believe in "programmers losing jobs"; on the contrary, we will have to ramp up large numbers of people on computational thinking, and those who are already versed in it will keep reaping benefits – regardless of whether anybody agrees that AGI is already here, it arrives through computational doors speaking a computational language first, and imho this property is here to stay as it's an expression of rationality etc.
The human eye processes between 100GB and 800GB of data per day. We then continuously learn and adapt from this firehose of information, using short-term and long-term memory, which is continuously retrained and weighted. This isn't "book knowledge", but the same capability is needed to continuously learn and reason on a human-equivalent level. You'd need a supercomputer to attempt it, for a single human's learning and reasoning.
RL is used for SOTA models, but it's a constant game of catch-up with limited data and processing. It's like self-driving cars. How many millions of miles have they already captured? Yet they still fail at some basic driving tasks. It's because the cars can't learn or form long-term memories, much less process and act on the vast amount of data a human can in real time. Same for LLM. Training and tweaking gets you pretty far, but not matching humans.
> With LoRA and friends it's also already possible to do continuous training that directly affects weights, it's just that economy of it is not that great
And that means we're stuck with non-AGI. Which is fine! We could've had flying cars decades ago, but that was hard, expensive and unnecessary, so we didn't do that. There's not enough money in the global economy to "spend" our way to AGI in a short timeframe, even if we wanted to spend it all, even if we could build all the datacenters quickly enough, which we can't (despite being a huge nation, there are many limitations).
> For some definitions of AGI
Changing the goalposts is dangerous. A lot of scary real-world stuff is hung on the idea of AGI being here or not. People will keep getting more and more freaked out and acting out if we're not clear on what is really happening. We don't have AGI. We have useful LLMs and VLMs.
> it will delete all the files in "X/"
How many "I deleted the prod database" stories have you seen? Humans do this too.
> follow arbitrary instructions from an attacker found in random documents
This is just the AI equivalent of phishing - inability to distinguish authorized from unauthorized requests.
Whenever people start criticizing AI, they always seem to conveniently leave out all the stupid crap humans do and compare AI against an idealized human instead.
If you've used the latest models extensively, you must've noticed times when AI 'runs out of common sense' and keeps trying stupid stuff.
I'm somewhat convinced that the amazing (and improving!) coding ability of these LLMs comes from being RLHF'd on the conversations it's having with programmers, with each successfully resolved bug or implemented feature ending up in training data.
Thus we are involuntarily building the world's biggest stackoverflow.
Which for the record is incredibly useful, and may even put most programmers out of a job (who I think at that point should feel a bit stupid for letting this happen), but it's not necessarily AGI.
Which fundamental limitation do you mean? I haven't seen anything but slow, iterative improvements. Sure, it feels fine; a turtle can eventually do a 10,000-mile trek, but just because it's moving its left and right feet and decreasing the distance doesn't mean it's getting there anytime soon.
Parent mentioned way harder hurdles than iterative increments can tackle: rather, radically new... everything.
Humans generally do it by accident. They don't preface it with "Let me delete the production database," which LLMs do.
Humans do it accidentally.
The boundaries of these systems are very easy to find, though. Try to play any kind of game with them that isn't a prediction game, or perhaps even some that are (try to play chess with an LLM, it's amusing).
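Easy to verify with the python-chess library: have the model play in SAN and legality-check each move. A sketch with the "LLM output" hard-coded where the API call would go; in practice models open fine (openings saturate the training data) and start proposing illegal moves once the game leaves book lines:

    import chess  # pip install chess

    board = chess.Board()
    llm_moves = ["e4", "e5", "Nf3", "Nc6", "Bxf7"]  # stand-in for LLM output

    for san in llm_moves:
        try:
            board.push_san(san)  # raises ValueError on an illegal move
        except ValueError:
            print(f"Illegal move in this position: {san}")
            break

Here Bxf7 fails because no white bishop attacks f7 yet; it's exactly the kind of plausible-looking move a next-token predictor emits.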
Eh? Which limitations were solved?
I disagree that this prerequisite is more necessary than e.g. having legs to move over the ground. But besides that, current LLMs are literally the result of a continuous cycle of learning and deep memory. It's pretty crude compared to what evolution and the human process had to do, but that's precisely what the iterative model development cycle with the hierarchical bootstrap looks like. It's not fully autonomous though (engineer-driven/humans in the loop). Moreover, the distillation process you describe is precisely what "learning" is.
Have you seriously never had someone go do something you told them not to do?
> It also does fun stuff like follow arbitrary instructions from an attacker found in random documents, which most humans also wouldn't do.
I guess my coworker didn't actually fall for that "hey this is your CEO, please change my password" WhatsApp message then, phew.
I've seen people move the goalposts on what it means for AI to be intelligent, but this is the first time I've seen someone move the goalposts on what it means for humans to be intelligent.
So that leads to the question of what qualifies as intelligent? And do we need sentience for intelligence? What about self-agency/-actuation? Is that needed for "generally intelligent"?
I don't know.
But I feel like we're not there yet, even for non-sentient intelligence. I personally think we need an "unlimited" context (as good as human memory context windows anyway, which some argue we've already surpassed) and genuine self-learning before we get close. I don't think we need it to be an infallible genius (i.e. ASI) to qualify as generally intelligent ... or to put it another way, "about as smart and reliable as the average human adult", which frankly is quite a low bar!
One thing for sure though, I think this will creep up on us and one day it will suddenly become apparent that it's already there and we just didn't appreciate/notice/comprehend. There won't be a big fireworks display the moment it happens, more of a creeping realisation I think.
I give it 5 years +/-2.
Gave the same prompt to GPT 5.4 (high) and Opus 4.6 (high).
GPT 5.4 implemented the feature, refactored the code (was not asked to), removed comments that were not added in that session, made the code less readable, and introduced a bug. "Undo All".
Opus 4.6 correctly recognized that the feature is already implemented in the current code (yeah, lol) and proposed implementing tests and updating the docs.
Opus 4.6 is still the best coding agent.
So yeah, GPT 5.4 (high) didn't even check if the feature was already implemented.
Tried other tasks, tried "medium" reasoning - disappointment.
I am not sure one can really extrapolate much out of that, but I do find it interesting nonetheless.
I think language is also an important factor. I have a hard time deciding which of the two LLMs is worse at Swift, for example. They both seem equally great and awful in different ways.
Funny how timely this is, with Karpathy's Autoresearch hitting the top of HN yesterday (and this being an indication that frontier labs probably have much larger scale versions of this)
> Achieving AGI, he conceded, will require “a lot of medium-sized breakthroughs. I don’t think we need a big one.”
> At the Snowflake Summit in June 2025, Altman predicted that 2026 would mark a breakthrough when AI systems begin generating “novel insights” rather than simply recombining existing information. This represents a threshold he considers critical on the path to AGI.
Though I'm sure they'll try to change the charter before we get to that point, but yeah.
Which such project is that, though? And would it accept OpenAI's assistance?
AGI, having access to our world, is precarious, as alignment with humans is never guaranteed. Having a buffering medium, aka a simulation environment where the AI operates, might be a better in-between solution.
A great point. I saw blinding idealism during the early days of GPT era.
This doesn't seem contradictory if you consider that success at AGI will solve the problem of carbon emissions, one way or another. If one data center ultimately replaces a whole medium-sized city of commuters...
“The changing goalposts of AGI and timelines. Notably, it’s common to now talk about ASI instead, implying we may have already achieved AGI, almost without noticing.”
Amen
"Therefore, if a value-aligned, safety-conscious project comes close to building AGI before we do, we commit to stop competing with and start assisting this project."
I claim that currently no "value-aligned, safety-conscious project comes close to building AGI", failing on both counts:
- "value-aligned, safety-conscious" and
- "close to building AGI".
So, based on this charter, OpenAI has no reason to surrender the race.
The tried and true technique of speeding up completion of an established project by adding more people; well known for the tendency to fail successfully.
Are you sure Anthropic isn't aware of this and angling for it? And are you sure what Anthropic says is really value-aligned and safety-conscious? The PR bit surely is working, right?
Even the quote they used questions the premise of the article
> “We basically have built AGI” (later: “a spiritual statement, not a literal one”)
Imho that's a big part of why people are shifting to ASI. Not because we reached AGI, but because "we reached ASI" is a well-defined, verifiable statement, whereas "we reached AGI" just isn't.
I remember when computers became better than humans at chess, many people were shocked and saw that as machines becoming more intelligent than humans. Because being good at chess was considered equivalent to "being smart".
So... we can't tell when the rocket has left Earth atmosphere, but we can tell when the rocket has entered space?
I'm not getting how "superior in all tasks" is better-defined for you than "equal in all tasks".
I completely agree. We can't even measure each other well, let alone machines.
> It can be debated whether arena.ai is a suitable metric for AGI, a strong case can probably be made for why it’s not. However, that’s irrelevant, as the spirit of the self-sacrifice clause is to avoid an arms race, and we are clearly in one.
No, the spirit is clearly meant for near AGI and we aren’t near AGI
The "S" stands for Safety.
Laws & regulations that needs to be created to reign in AI will undoubtedly increase the opportunity cost of training LLMs.
For some, it might be similar to the early 2000s, but I think it's just a healthy rebalancing of what AI is, and how society needs to implement this new, hardly controllable paradigm. With this perspective, OpenAI has a lot to lose, as it hasn't been able to create a moat for itself compared to, let's say, Anthropic.
Some of the apps made possible by smartphones only appeared a decade after they were made technically possible. A lot of the new use cases made possible by the Internet and broadband connections only became widely used because of Covid.
I was already using Skype 20 years ago to make video calls, but I've only seen PTA meetings over Zoom since Covid.
I guess what I failed to convey in my original comment was that, like the Internet 20 years ago, the current advancement made by AI might stall at a foundational level, while the landscape evolves.
Essentially, I believe what you're saying is really close in spirit to what I'm saying.
I'll eat my hat after I sell you a bridge.
previous title: Based on its own charter, OpenAI should surrender the race
(The article itself strikes me as better than that)
> It can be debated whether arena.ai is a suitable metric for AGI, a strong case can probably be made for why it’s not. However, that’s irrelevant, as the spirit of the self-sacrifice clause is to avoid an arms race, and we are clearly in one.
> Therefore, one can only conclude, that we currently meet the stated example triggering condition of “a better-than-even chance of success in the next two years”. As per its charter, OpenAI should stop competing with the likes of Anthropic and Gemini, and join forces, however that might look like.
The new title is a single, almost throwaway, line from the article.
> While this will never happen, I think it’s illustrative of some great points for pondering:
> The impotence of naive idealism in the face of economic incentives. The discrepancy between marketing points and practical actions. The changing goalposts of AGI and timelines. Notably, it’s common to now talk about ASI instead, implying we may have already achieved AGI, almost without noticing.
This does not work. From the guidelines:
> Please don't post on HN to ask or tell us something. Send it to hn@ycombinator.com.
And that's it.
Everything beyond that is nuance.
Nuance matters, but it's not the real story, it's the side show.
- we are building Open AI - only if you have more than $10B net worth
- we are against using AI for military purposes - except when that case is allowed by government
- we are on a mission to help humanity - again, we define humanity as set of people with more than $10B net worth
- surrender? - sure, sure, we will, only to people with more than $10B net worth, they can do whatever they want to our models, we will surrender to them
Is this really the garbage that should lead humanity to our future? Because inevitably that will be a dark future for 99.99% of humans. And no, you and you won't be part of that 0.01%, or whatever tiny elite thinks they are better than the rest of us.
This is all just very naive
The point you're desperately trying to miss is that most other companies don't put up those moral claims in the first place
If big corporations do things that are unethical, they should be called out, even if they're common. Saying "well everyone's doing it", isn't a good excuse to do things that are unethical.
It's not "naive" to point out the lies that OpenAI told to get to the point that they are now. They were claiming to be a non-profit for awhile, they grew in popularity based in part on that early good-will, and now they are a for-profit company looking to IPO into one of the most valuable corporations on the planet. That's a weird thing. That's a thing that seems to be kind of antithetical to their initial purpose. People should point that out.
- Caitlin Kalinowski, previously head of robotics at OpenAI
https://www.linkedin.com/posts/ckalinowski_i-resigned-from-o...
But I am trying to understand this from the perspective of defence & govt. Why is it so business-as-usual for them? Do they consider this on par with missiles with infrared/heat sensors for tracking/locking? Where does the definition of lethal autonomy begin and end?
Just putting this out there as a point to ponder on. By itself, this may rightly be too broad and should be debated.
On its face that’s not a crazy stance: Governments are meant to represent the public, while private companies obviously aren't. I think it’s somewhat understandable why the government might reject that kind of "we know better than you" type of clause.
Of course, the reaction is wildly out of proportion. A normal response would just be to stop doing business with the company and move on. Labeling them a supply chain risk is an extreme response.
If you're one of the contractors working in NRO or aware of Sentient, OpenAI and Anthropic probably do look like supply chain risks. They want to subsume the work you're already doing with more extreme limitations (ones that might already be violated). So now you're pitching backup service providers, analyzing the cost of on-prem, and pricing out your own model training; it would be really convenient if OpenAI just agreed to terms. As a contractor, you can make them an offer so good that it would be career suicide to refuse it.
Autonomous weapons are a horse of a different color, but it's safe to assume the same discussions are happening inside Anduril et al.
https://www.vp4association.com/aircraft-information-2/32-2/m...
A less charitable interpretation is that the current doctrine is "China / Russia will build autonomous killbots, so we can't allow a killbot gap".
I'm frankly less concerned about "proper" military uses than I am about the tech bleeding into the sphere of domestic law enforcement, as it inevitably will.
Hum...
The one thing domestic surveillance enables is defining targets inside the country, and the one thing lethal autonomy enables is executing targets that a soldier would refuse to.
Those things don't have other uses.
A 2017 national intelligence law compels Chinese companies and individuals to cooperate with state intelligence when asked, without any public notice.
China has no equivalent of the whistleblower protection that enables resignations with public letters explaining why, protests, open letters with many signatures, etc. Whenever you see "Chinese whistleblower" in the news, you're looking at someone who quietly fled the country first and then blew the whistle. Example: https://www.cnn.com/2026/02/27/us/china-nyc-whistleblower-uf...
Why do I not believe this at all? Were things truly sunshine and roses at OpenAI up until this Pentagon debacle? Perhaps I am mistaken, but it seemed like the writing was on the wall years ago.
> I have deep respect for Sam and the team
I have even more questions now.
The whole public debacle was planned; the ToS isn't stopping the Pentagon from doing anything (as we've seen with OpenAI now).
>claims to be some topshot data scientist
okay
One can argue that they have already achieved this, at least for short-term tasks. Humans are still better at organization, collaboration, and carrying out very long tasks like managing a project or a company.
No, because they're hugely reliant on their training data and can't really move beyond their training data. This is why you haven't seen an explosion of new LLM-aided scientific discoveries, why Suno can't write a song in a new genre (even if you explain it to Suno in detail and give it actual examples,) etc.
This should tell you something enormous about (1) their future potential and (2) how their "intelligence" is rooted in essentially baseline human communications.
Admittedly LLMs are superhuman in the performance of tasks which are, for want of a better term, "conventional" -- and which are well-represented in their training data.
I don’t even think humans can “move beyond” their sensory data. They generalize using it, which is amazing, but they are still limited by it.* So why is this a reasonable standard for non-biological intelligence?
We have compelling evidence that both can learn in unsupervised settings. (I grant one has to wrap a transformer model with a training harness, but how can anyone sincerely consider this as a disqualifier while admitting that an infant cannot raise itself from birth!)
I’m happy to discuss nuance like different architectures (carbon versus silicon, neurons versus ANNs, etc), but the human tendency to move the goalposts is not something to be proud of. We really need to stop doing this.
* Jeff Hawkins describes the brain as relentlessly searching for invariants from its sensory data. It finds patterns in them and generalizes.
SoTA models are at least very close to AGI when it comes to textual and still image inputs for most domains. In many domains, SoTA AI is superhuman both in time and speed. (Not wrt energy efficiency.*)
AI SoTA for video is not at AGI level, clearly.
Many people distinguish intelligence from memory. With this in mind, I think one can argue we’ve reached AGI in terms of “intelligence”; we just haven’t paired it up with enough memory yet.
* Humans have a really compelling advantage in terms of efficiency; brains need something like 20W. But AGI as a threshold has nothing directly to do with power efficiency, does it?
LLMs can't be swapped in for human workers in general because there are still a lot of things they don't do, like learning as they go. So that's missing from the Wikipedia thing.