0 points · benjaminl · 1mo ago · 0 comments

The issue here is that people have different definitions of AGI. From the description, getting 100% on this benchmark would be more than AGI and would qualify as ASI (Artificial Superintelligence), not just AGI.

throwuxiytayq 1mo ago

People are still debating whether these models exhibit any kind of intelligence or any kind of thinking. Setting the bar higher than necessary is welcome, but at this point I’m pretty sure everyone’s opinions are set in stone.

fc417fc802 1mo ago

If you only outdo humans 50% of the time, you're never going to get consensus on whether you've qualified. Whereas outdoing 90% of humans on 90% of the most difficult tasks we could come up with is going to be difficult to argue against.

This benchmark is only one such task. After this one there's still the rest of that 90% to go.

Beating humans isn't anywhere near sufficient to qualify as ASI. That's an entirely different league with criteria that are even more vague.

nearbuy 1mo ago

Even dumb humans are considered to have general intelligence. If the bar is having to outdo the median human, then 50% of humans don't have general intelligence.

fc417fc802 1mo ago

Not true. We don't have a good definition for intelligence - it's very much an "I'll know it when I see it" sort of thing.

Frontier models are reliably providing high undergraduate to low graduate level customized explanations of highly technical topics at this point. Yet I regularly catch them making errors that a human never would and which betray a fatal lack of any sort of mental model. What are we supposed to make of that?

It's an exceedingly weird situation we find ourselves in. These models can provide useful assistance to literal mathematicians yet simultaneously show clear evidence of lacking some sort of reasoning the details of which I find difficult to articulate. They also can't learn on the job whatsoever. Is that intelligence? Probably. But is it general? I don't think so, at least not in the sense that "AGI" implies to me.

Once humanity runs out of examples that reliably trip them up, I'll agree that they're "general" to the same extent that humans are, regardless of whether we've figured out the secrets behind things such as cohesive world models, self-awareness, active learning during operation, and theory of mind.

LordDragonfang 1mo ago

> Yet I regularly catch them making errors that a human never would

I have yet to see an "error" that modern frontier models make that I could not imagine a human making - average humans are way more error-prone than the kind of person who posts here thinks, because the social sorting effects of intelligence are so strong that you almost never actually interact with people more than half a standard deviation away. (The one exception is errors in spatial reasoning about things humans are intimately familiar with - for example, clothing - because LLMs live in literary space, not physics space, and only know about these things secondhand.)

> and which betray a fatal lack of any sort of mental model.

This has not been a remotely credible claim for at least the past six months, and it seemed obviously untrue for probably a year before then. They clearly do have a mental model of things, it's just not one that maps cleanly to the model of a human who lives in 3D space. In fact, their model of how humans interact is so good that you forget that you're talking to something that has to infer rather than intuit how the physical world works, and then attribute failures of that model to not having one.

charcircuit 1mo ago

I think you are getting caught up on the intelligence part. That is the easy part, since AGI doesn't have to be intelligent; it just has to be intelligence. If you look at early chess AIs, you will see that they were very weak compared to even a beginner human. The level of intelligence does not matter for a chess bot to be considered AI. It is the fact that it is emulating intelligence that makes it AI.

> But is it general? I don't think so

I would consider it general, since I can take any problem I can think of and the AI will make an attempt to solve it. Actually solving it is not a requirement for AGI. Being able to solve it just makes it smarter than an AGI that can't. You can trip up chess AI, but that doesn't stop it from being AI. So why apply that standard to AGI?

nearbuy 1mo ago

> Not true.

It's certainly true. By definition. If the bar for general intelligence is being smarter than the median human, 50% of people won't reach the threshold for general intelligence. (And if the bar is beating the median in every cognitive test, then a much smaller fraction of people would qualify.)

People don't have a consistent definition of AGI, and the definitions have changed over the past couple of years, but I think most people have settled on it meaning at least as smart as humans in every cognitive area. But that has to be compared to dumb people, not the median. We don't want to say that regular people don't have general intelligence.
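
To put rough numbers on the "beating the median in every cognitive test" point, here is a minimal sketch. It assumes, purely for illustration, that scores on different tests are statistically independent and that exactly half of people beat the median on any one test. Neither assumption comes from the thread, and the function name is hypothetical.

```python
def fraction_above_median_on_all(num_tests: int) -> float:
    """Fraction of people who beat the median on every one of num_tests tests.

    ASSUMPTION (illustrative only): test scores are independent, so each
    test is passed with probability 0.5 regardless of the other tests.
    """
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    return 0.5 ** num_tests


if __name__ == "__main__":
    # Show how quickly the qualifying fraction shrinks as tests are added.
    for k in (1, 3, 5, 10):
        print(f"{k} tests: {fraction_above_median_on_all(k):.4%} qualify")
```

Real cognitive test scores tend to be positively correlated, so the true fraction would likely be higher than this independent-case figure, though still below 50% once there is more than one test.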

foltik 1mo ago

I’d be hesitant to call that ASI if it’s pretty obvious how you’d write a regular old program to solve it.

nopinsight 1mo ago

It’s not that simple, since each problem is supposed to be distinct and different enough that no single program can solve several of them properly. No problem spec is provided either, iiuc, so you can’t simply ask an LLM to generate code without doing other things.

fc417fc802 1mo ago

A human can sit down to play a game with unknown rules and write a spec as he goes. If a model can't even figure out to attempt that, let alone succeed at it, then it most certainly isn't an example of "general" intelligence.

LordDragonfang 1mo ago

> A human can sit down to play a game with unknown rules and write a spec as he goes.

Some humans can. Many, if not most, humans cannot. A significant enough fraction of humans have trouble putting together Ikea furniture that there are memes about its difficulty. You're vastly overestimating the capabilities of the average human. Working in tech puts you in probably the top ~1-5% in ability to intuit and understand rules, but it distorts your intuition of what a "reasonable" baseline for that is.

cubefox 1mo ago

It's not obvious at all. And I would say pretty much impossible without using machine learning. Even for ARC-AGI-1 there is no GOFAI program that scores high.

LordDragonfang 1mo ago

In retrospect, it seems obvious that we hit AGI by a reasonable "at least as intelligent as some humans" definition when o3 came out, and everything since then has been goalpost moving by people who have higher and higher bars for which percentile human they would be willing to employ (or consider intellectually capable). People should really just use the term "ASI" when their definition of AGI excludes the majority of humans.

Edit: Here's the guy who coined the term saying we're already there. Everything else is arguing over definitions.

https://x.com/mgubrud/status/2036262415634153624

> Well, Lars, I INVENTED THE TERM and I say we have achieved AGI. Current models perform at roughly high-human level in command of language and general knowledge, but work thousands of times faster than us. Still some major deficiencies remain but they're falling fast.

iLoveOncall 1mo ago

There's a single true definition of AGI: open the Wikipedia page about AGI, but via an archive.org snapshot from 10 years ago.

All the rest is bullshit made up by LLM labs to make it seem like they hit AGI by dumbing down its definition.

https://web.archive.org/web/20150108000749/https://en.wikipe...
