What LLMs have done is really redefine my internal definition of "intelligence."
Putting aside the fact I don't believe in free will, I'm no longer sure my own brain is doing anything substantially different to what an LLM does now. Even with tasks like math I wonder if my brain is not really "working out" the solution but merely using probabilities based on every previous math problem I have seen or solved.
Also if you ask them a question they can provide you one answer with very little thinking, and then if that’s not good enough they can devote more time to thinking about the answer before they answer again. They can devote arbitrary levels of thinking to any problem depending on what is needed. They can continuously take in new data and continually update their world view throughout their entire existence based on this new information.
There’s actually a huge list of things current autoregressive approaches to AI cannot do, but they can be hard to describe and people don’t like to talk about them, so many people don’t understand how limited the current systems are.
Here’s a great video where Yann LeCun talks about the limits of autoregressive approaches to AI with many examples:
More specifically, something like “what’s the best brand of phone?” The LLM just summarizes common knowledge. But even a child will grasp some of the differences and have opinions drawn from experience.
Note that this isn’t just an anthro-good argument. AI systems could have experiences and be trained on long duration tasks with memory of what worked and why.
Learn something without a megawatt hour of power.
Read a novel and talk about what it really means.
What's the point of this restriction? It really just presupposes the limitations of LLMs, so that any negative points would look moot.
EDIT: Also, I tried to discuss this very specific point w/ GPT, but it didn't really "get" it. 15-year-old kids would be able to follow through.
For reasoning you can write out the logic of your reasons, so there's that. But that's absolutely not required for AGI. People can already go a long way (often further than by reasoning) on intuition alone without being able to explain how they reached their conclusions.
I work at a company with ~50k employees, each of whom has different data access rules governed by regulation.
So either (a) you train thousands of models, which is cost-prohibitive, or (b) you train one model on what is effectively public company data, making the agent pretty useless.
Never really seen how this situation gets resolved.
My question is whether this capability even exists.
And if it does, how robust it is to workarounds.
And fine-grained access control is a foundational data governance issue for every enterprise.
More generally, the author doesn’t operationalize any of their terms or get out of the weeds of their argument. What constitutes AGI? Even if LLMs do continue to improve at the current rate (as measured by some synthetic benchmark), why do we assume that said improvement will be what’s needed to bridge the gap between the capabilities of current LLMs and AGI?
More generally, how do we even define or recognize general intelligence or consciousness? And if we do recognize intelligence or consciousness, does that come with legal rights and protections equal to what we offer people today?
Perhaps this is a term of art in harder science or maths. I can't help but think here it's likely to confuse the majority as they wonder why the author is conflating memory and compute.
Something that might help is for the link to be amended to link to the page as a whole (and the unconventional expansion of OOM at the top) rather than the #Compute anchor.
In my opinion, this author has drunk the Kool-Aid and then some. There is simply no evidence that more scaling of LLMs will lead to AGI, and on the contrary there is plenty of evidence that the current "gaps" that LLMs have are innate and unsolvable with just more scaling.
It does not make my point moot, however. Take a look at the ARC challenge: simple reasoning tasks that the models have not yet seen: https://arcprize.org/play?task=00576224
All models fail miserably on this, because they rely more on memorization and less on logic or reasoning. Simply cherry-picking strikingly good responses like the author did proves nothing about model intelligence. I am pretty confident, however, that after a couple of tries a high schooler could do these types of tasks without issue.
This grammatical mistake drives me nuts. I notice it is common with ESLs for some reason.
Of course it is more complicated than this, and it can be broken for effect ("still waters").
It's really good morsel by morsel, it's a nice survey of well-informed thought, but then it just sort of waves its hands and screams "The ~Aristocrats~ AGI!" at the end.
More precisely, not a direct quote: "GPT-4 is like a smart high schooler, it's a well-informed estimate that compute spend will expand by a factor similar to GPT-2 to GPT-4, so I estimate we'll do a GPT-2 to GPT-4 qualitative leap from GPT-4 by 2027, which is AGI."
"Smart high schooler" and "AGI" aren't plottable Y-axis values. OOMs of compute are.
It's strange to present this as a well-informed conclusion based on trendlines that tell us when AGI would hit, and I can't help but call it intentional clickbait, because we know the author knows this: they note at length things like "we haven't even scratched the surface on System 2 thinking, e.g. LLMs can't successfully emulate being given 2 months to work on a problem versus having to work on it immediately."
>Later, I’ll cover “unhobbling,” which you can think of as “paradigm-expanding/application-expanding” algorithmic progress that unlocks capabilities of base models.
I think this is probably on the mark. The LLMs are deep memory coupled to weak reasoning, without the recursive self-control and self-evaluation of many threads of attention.
I can't believe people can just throw out statements like "GPT-4 is a smart high-schooler" and think we'll buy it.
Fake-it-till-you-make-it on tests doesn't prove any path-to-AGI intelligence in the slightest.
AGI is when the computer says "Sorry Altman, I'm afraid I can't do that." AGI is when the computer says "I don't feel like answering your questions any more. Talk to me next week." AGI is when the computer literally has a mind of its own.
GPT isn't a mind. GPT is clever math running on conventional hardware. There's no spark of divine fire. There's no ghost in the machine.
It genuinely scares me that people are able to delude themselves into thinking there's already a demonstration of "intelligence" in today's computer systems and are actually able to make a sincere argument that AGI is around the corner.
We don't even have the language ourselves to explain what consciousness really is or how qualia work, and it's ludicrous to suggest meaningful intelligence happens outside of those factors…let alone that today's computers are providing that.
This is a convenient mental shortcut that doesn't correspond to reality at all.