So I don't buy the engineering angle. I also don't think LLMs will scale up to AGI as imagined by Asimov or any of the usual sci-fi tropes. There is something more fundamental missing, as in missing science, not missing engineering.
The real philosophical headache is that we still haven’t solved the hard problem of consciousness, and we’re disappointed because we hoped in our hearts (if not out loud) that building AI would give us some shred of insight into the rich and mysterious experience of life we somehow incontrovertibly perceive but can’t explain.
Instead we got a machine that can outwardly present as human, can do tasks we had thought only humans can do, but reveals little to us about the nature of consciousness. And all we can do is keep arguing about the goalposts as this thing irrevocably reshapes our society, because it seems bizarre that we could be bested by something so banal and mechanical.
And now, I still don't know; the months go by and as far as I'm aware they're still pursuing these goals but I wonder how much conviction they still have.
I doubt it. Human intelligence evolved from organisms much less intelligent than LLMs and no philosophy was needed. Just trial and error and competition.
Original 80s AI was based on mathematical logic. And while that might not encompass all philosophy, it certainly was a product of philosophy broadly speaking - something some analytical philosophers could endorse. But it definitely failed, and it failed because it couldn't process uncertainty (imo). I also think that if you look closely, classical philosophy wasn't particularly amenable to uncertainty either.
If anything, I would say that AI has inherited its failure from philosophy's failure and we should look to alternative approaches (from Cybernetics to Bergson to whatever) for a basis for it.
How long until that gets more reliable than a simple database? How long until it can execute code faster than a CPU running a program?
A lot of the stuff humans accomplish is through technology, not due to growing a bigger brain. Even something seemingly basic like a math equation benefits drastically from being written down with pen&paper instead of being juggled in the human brain itself (see Extended mind thesis). And when it comes to something like running a 3D engine, there is pretty much no hope of doing it with just your brain.
Maybe we will get AIs smart enough that they can write their own tools, but for that to happen, we still need the infrastructure that allows them to write the tools in the first place. The way they can access Python is a start, but there is still a lack of persistence that lets them keep their accomplishments for future runs, be it in the form of a digital notepad or dynamic updating of weights.
The first line and the conclusion is: "The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin." [1]
I don't necessarily agree with its examples or the direction it vaguely points at. But its basic statement seems sound. And I would say that there's a lot of opportunity for engineering, broadly speaking, in the process of creating "general methods that leverage computation" (i.e., that scale). What the bitter lesson page was really about was earlier "AI" methods based on logic programming, which included information on the problem domain in the code itself.
And finally, the "engineering" the paper talks about actually is pro-Bitter lesson as far as I can tell. It's taking data routing and architecture as "engineering", and here I agree this won't work - but for the opposite reason - specifically 'cause I don't think just data routing/processing will be enough.
[1]https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson...
I'd argue it's because intelligence has been treated as a ML/NN engineering problem that we've had the hyper focus on improving LLMs rather than the approach articulated in the essay.
Intelligence must be built from a first principles theory of what intelligence actually is.
Hinton thinks the 3rd is inevitable/already here and humanity is doomed. It's an odd arena.
Once we got the “Attention is all you need” paper I don’t remember anyone saying we couldn’t get better results by throwing more data and compute at it. But now we’ve pretty much thrown all the data and all the compute (as much as we can reasonably manufacture) at it. So clearly we’re at the end of that phase.
I suppose one can argue about whether designing a new AGI-capable architecture and learning algorithm(s) is a matter of engineering (applying what we already know) or research, but I think saying we need new scientific discoveries is going too far.
Neural nets seem to be the right technology, and we've now amassed a ton of knowledge and intuition about what neural nets can do and how to design with them. If there was any doubt, then LLMs, even if not brain-like, have proved the power of prediction as a learning technique - intelligence essentially is just successful prediction.
It seems pretty obvious that the rough requirements for a neural-net architecture for AGI are going to be something like our own neocortex and thalamo-cortical loop - something that learns to predict based on sensory feedback and prediction failure, including looping and working memory. Built-in "traits" like curiosity (prediction failure => focus) and boredom will be needed so that this sort of autonomous AGI puts itself into learning situations and is capable of true innovation.
The major piece to be designed/discovered isn't so much the architecture as the incremental learning algorithm, and I think if someone like Google-DeepMind focused their money, talent and resources on this then they could fairly easily get something that worked and could then be refined.
Demis Hassabis has recently given an estimate of human-level AGI in 5 years, but has indicated that a pre-trained(?) LLM may still be one component of it, so not clear exactly what they are trying to build in that time frame. Having a built-in LLM is likely to prove to be a mistake where the bitter lesson applies - better to build something capable of self-learning and just let it learn.
He said 50% chance of AGI in 5 years.
I'd say better model architecture rather than more data. A human can learn to do things more complex than an LLM with less data. I think modelling the world as a static system to be representation-learned in an unsupervised fashion is blocked on the static assumption. The world is dynamical, and that should be reflected in the base model.
But yeah, definitely not an engineering problem. That's like saying the reason a crow isn't as smart as a person is because it doesn't have the hands to type on keyboards. But it's also not because it hasn't seen enough of the world like you're saying. It's because its brain isn't complex enough.
I am thinking we need a foundation, something that is concrete and explicit and doesn't do hallucination. But has very limited knowledge outside of absolute Maths and basic physics.
The underlying assumption is that it exists in the first place. Or rather, one must first accept an axiom.
In Fermi, it's that interstellar signals can be detected and further travel is possible.
In AGI, it's that intelligence is an isolatable process which we can bootstrap in minimal time.
Both assume that human progress is a template for unlimited exponential growth.
Did GPT-2 scale up to be an expert system ? No - it scaled up to be GPT-3
..
Did GPT-4 scale up to become AGI ? No - it scaled up to be GPT-5
Moreover, the differences between each new version are becoming increasingly small. We're reaching an asymptote because the more data you've trained on, natural or synthetic, the smaller the impact of any incremental additions.
If you scale up an LLM big enough, then essentially what you'll get is GPT-5.
A good contrast is quantum computing. We know that's possible, even feasible, and now are trying to overcome the engineering hurdles. And people still think that's vaporware.
A discovery that AGI is impossible in principle to implement in an electronic computer would require a major fundamental discovery in physics that answers the question “what is the brain doing in order to implement general intelligence?”
So the question is whether human intelligence has higher-level primitives that can be implemented more efficiently - sort of akin to solving differential equations, is there a “symbolic solution” or are we forced to go “numerically” no matter how clever we are?
It need not even be incomputable; it could be NP-hard and practically incomputable, or it could be undecidable, i.e. a version of the halting problem.
There are any number of ways our current models of mathematics or computation could in theory be shown to be incapable of expressing AGI without needing a fundamental change in physics.
We don’t even have a workable definition, never mind a machine.
Presumably "brains" do not do many of the things that you will measure AGI by, and your brain is having trouble understanding the idea that "brain" is not well understood by brains.
Does it make it any easier if we simplify the problem to: what is the human doing that makes (him) intelligent ? If you know your historical context, no. This is not a solved problem.
Intelligence is an emergent phenomenon; all the interesting stuff happens at the boundary of order and disorder but we don’t have good tools in this space.
(I’m not saying it is, just that it’s possible)
If you believe in eg a mind or soul then maybe it's possible we cannot make AGI.
But if we are purely biological then obviously it's possible to replicate that in principle.
Whether it is feasible or practical or desirable to achieve AGI is another matter, but the OP lays out multiple problem areas to tackle.
It can plan and take actions towards arbitrary goals in a wide variety of mostly text-based domains. It can maintain basic "memory" in text files. It's not smart enough to work on a long time horizon yet, it's not embodied, and it has big gaps in understanding.
But this is basically what I would have expected v1 to look like.
That wouldn't have occurred to me, to be honest. To me, AGI is Data from Star Trek. Or at the very least, Arnold Schwarzenegger's character from The Terminator.
I'm not sure that I'd make sentience a hard requirement for AGI, but I think my general mental fantasy of AGI even includes sentience.
Claude Code is amazing, but I would never mistake it for AGI.
For me, AGI is an AI that I could assign an arbitrarily complex project, and given sufficient compute and permissions, it would succeed at the task as reliably as a competent C-suite human executive. For example, it could accept and execute on instructions to acquire real estate that matches certain requirements, request approvals from the purchasing and legal departments as required, handle government communication and filings as required, construct a widget factory on the property using a fleet of robots, and operate the factory on an ongoing basis while ensuring reliable widget deliveries to distribution partners. Current agentic coding certainly feels like magic, but it's still not that.
I presuppose that you actually mean ASI as a starting point, and that is being charitable that it isn’t just pattern matching to questionable sci-fi.
What really occurs to me is that there is still so much that can be done to leverage LLMs with tooling. Just small things in Claude Code (plan mode for example) make the system work so much better than (e.g.) the update from Sonnet 3.5 to 4.0 in my eyes.
I suspect most people envision AGI as at least having sentience. To borrow from Star Trek, the Enterprise's main computer is not at the level of AGI, but Data is.
The biggest thing that is missing (IMHO) is a discrete identity and notion of self. It'll readily assume a role given in a prompt, but lacks any permanence.
One thing I like about the Mass Effect universe is the depiction of the geth, which qualify as AI. Each geth unit is not run by a singular intelligent program, but rather a collection of thousands of daemons, each of which makes some small component of the robot's decisions on its own, but together they add up to a collective consciousness. When you look at how actual modern robotics platforms (such as ROS) are designed, with many processes responsible for sensors and actuators communicating across a common bus, you can see the geth as sort of an extrapolation of that idea.
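To make the "many processes communicating across a common bus" picture concrete, here is a minimal sketch of the publish/subscribe pattern. It is not the ROS API: the `Bus` class, the topic names, and the two toy components are all hypothetical, and everything runs synchronously in one process rather than as separate daemons.

```python
from collections import defaultdict


class Bus:
    """Toy in-process message bus (hypothetical, not the ROS API)."""

    def __init__(self):
        # Maps a topic name to the callbacks subscribed to it.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the message to every subscriber of this topic.
        for callback in list(self._subscribers[topic]):
            callback(message)


bus = Bus()
log = []


def obstacle_detector(distance_m):
    # Small component: turns raw range readings into "obstacle" events.
    if distance_m < 0.5:
        bus.publish("obstacle", distance_m)


def motor_controller(distance_m):
    # Another small component: reacts to obstacle events only.
    log.append(f"stop: obstacle at {distance_m} m")


bus.subscribe("range", obstacle_detector)
bus.subscribe("obstacle", motor_controller)

bus.publish("range", 2.0)  # far away, nothing happens
bus.publish("range", 0.3)  # close, the controller is told to stop
print(log)
```

Neither component knows about the other; each only knows the topics it reads and writes, which is the property the geth comparison leans on.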
I certainly don't. It could be that's necessary but I don't know of any good arguments for (or against) it.
Philosophy Professor: Who is asking?
Student: I am!
AGI is poorly defined and thus is a science "problem", and a very low priority one at that.
No amount of engineering or model training is going to get us AGI until someone defines what properties are required and then researches what can be done to achieve them within our existing theories of computation which all computers being manufactured today are built upon.
Am I incorrect?
Natural language processing is definitely a huge step in that direction, but that's kinda all we've got for now with LLMs and they're still not that great.
Is there some lower level idea beneath linguistics from which natural language processing could emerge? Maybe. Would that lower level idea also produce some or all of the missing components that we need for "cognition"? Also a maybe.
What I can say for sure though is that all our hardware operates on this more linguistic understanding of what computation is. Machine code is strings of symbols. Is this not good enough? We don't know. That's where we're at today.
It's possible to stumble upon a solution to something without fully understanding the problem. I think this happens fairly often, really, in a lot of different problem domains.
I'm not sure we need to fully understand human consciousness in order to build an AGI, assuming it's possible to do so. But I do think we need to define what "general intelligence" is, and having a better understanding of what in our brains makes us generally intelligent will certainly help us move forward.
The architecture has to allow for gradient descent to be a viable training strategy, which means no branching (routing is bolted on).
And the training data has to exist. You can't find millions of pages depicting every thought a person went through before writing something, and such data can't exist because most thoughts aren't even language.
Reinforcement learning may seem like the answer here: brute-force thinking into happening. But it's grossly sample-inefficient with gradient descent and therefore only used for finetuning.
LLMs are autoregressive models, and the chosen configuration, where every token can only look back, allows for very sample-efficient training (one sentence can be dozens of samples).
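A rough sketch of why one sentence yields many training samples when every token can only look back: each prefix of the sentence becomes a context and the token that follows it becomes the target. The whitespace tokenization and the `next_token_samples` helper are simplifying assumptions for illustration; real models use subword tokens and compute all positions in one masked pass.

```python
def next_token_samples(tokens):
    """Return (context, target) pairs, one per position after the first."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


# Assumption: naive whitespace tokenization, just to keep the example readable.
sentence = "the cat sat on the mat".split()
samples = next_token_samples(sentence)

for context, target in samples:
    print(" ".join(context), "->", target)
```

A sentence of n tokens gives n - 1 prediction targets, which is where the "dozens of samples" per sentence comes from.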
While you say reinforcement learning isn't a good answer, I think it's the only answer.
But that recursive thought has a limit. For example: You can think about yourself thinking. With a little effort, you can probably also think about yourself thinking about yourself thinking. But you can't go much deeper.
With the advent of modern computing, we (as a species) have finally created a tool that can "think" recursively, to arbitrary levels of depth. If we ever do create a superintelligent AGI, I'd wager that its brilliance will be attributable to its ability to loop much deeper than humans can.
I guess smart people in big companies already consider this and are currently working on technologies for products that will include some form of electromagnetic brain sensing, provided conveniently as an interface but also, usefully, as a source of this data.
It also suggests to me that AI/AGI is far more susceptible to traditional disruption than the narratives of established incumbents suggest. You could have a Kickstarter like killer product, including such a headset that would provide the data to bootstrap that startup’s super AI.
Exciting times!
It would be interesting if in the very distant future, it becomes viable to use advanced brain scans as training data for AI systems. That might be a more realistic intermediate between the speculations into AGI and Uploaded Intelligence.
</scifi>
Imagine if we had an LLM in the 15th century. It would happily explain the validity of the geocentric system. It can't get to heliocentrism. The same way, modern LLMs can only tell us what we know and can't think, revolutionize, etc. They can be programmed to reason a bit, but 'reason' is doing a lot of heavy lifting here. The reasoning is just a better filter on what the person is asking or what is being produced for the most part and not an actual novel creative act.
The more time I spend with LLMs the more they feel like Google on steroids. I just am not seeing how this type of system could ever lead to AGI, and if anything, it is probably eating away at any remaining AGI hype and funding.
What if intelligence requires agency ?
So models improve on specific tasks, but they don't really improve generally across the board any longer.
Add an extra leg to any animal in a picture. Ask the vision LLM to tell you how many legs it sees. It will answer with the number a person would expect from a healthy individual, because it's not actually reasoning, it's not perceiving anything, it's pattern matching. It sees dog, it answers 4 legs. Maybe sometime in the future it won't do that, because they will add this kind of trick to their benchmaxxing set (training LLMs specifically on pictures that have fewer or more legs than the animal should), as they do every time there's a new generation of those illusory things. But that won't fix the fundamental problem that these things DO NOT REASON.
Training LLMs on sets of thousands and thousands and thousands of reasoning trick questions people ask on LM arena is borderline scamming people on the true nature of this technology. If we lived in a sane regulatory environment OAI would have a lot to answer for.
An unfortunate tendency that many in high-tech suffer from is the idea that any problem can be solved with engineering.
In the meantime I guess all the AI companies will just keep burning compute to get marginal improvements. Sounds like a solid plan! The craziest thing about all of this is that ML researchers should know better!! Anyone with extensive experience training models small or large knows that additional training data offers asymptotic improvements.
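The "asymptotic improvements" claim can be illustrated with a toy curve. The sketch below assumes a power law with an irreducible floor, a functional form commonly used when fitting loss against data size; the `loss` function and every constant in it are made up for illustration and are not fitted to any real model.

```python
def loss(n_tokens, floor=1.7, scale=400.0, exponent=0.3):
    """Toy loss curve: an irreducible floor plus a power-law term.

    All constants are illustrative assumptions, not measured values.
    """
    return floor + scale * n_tokens ** -exponent


sizes = [10**9, 10**10, 10**11, 10**12]
losses = [loss(n) for n in sizes]

# Improvement bought by each successive 10x increase in data.
gains = [before - after for before, after in zip(losses, losses[1:])]

for n, value in zip(sizes, losses):
    print(f"{n:.0e} tokens -> loss {value:.3f}")
print("gain per 10x:", [round(g, 3) for g in gains])
```

Under these assumptions each tenfold increase in data buys roughly half the improvement of the previous one, and the loss never drops below the floor.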
But even if LLMs are going to tap out at some point, and are a local maximum, dead-end, when it comes to taking steps toward AGI, I would still pay for Claude Code until and unless there's something better. Maybe a company like Anthropic is going to lead that research and build it, or maybe (probably) it's some group or company that doesn't exist yet.
Take memory for example: give LLM a persistent computer and ask it to jot down its long-term memory as hierarchical directories of markdown documents. Recalling a piece of memory means a bunch of `tree` and `grep` commands. It's very, very rudimentary, but it kinda works, today. We just have to think of incrementally smarter ways to query & maintain this type of memory repo, which is a pure engineering problem.
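As a rough sketch of the memory repo described above: notes are appended to markdown files in a directory hierarchy, and recall is a case-insensitive scan over those files, standing in for `grep -ri`. The `remember` and `recall` helpers, the topic paths, and the note contents are all hypothetical; in practice the model would issue shell commands itself.

```python
from pathlib import Path
import tempfile


def remember(root: Path, topic: str, note: str) -> Path:
    """Append a note to <root>/<topic>.md, creating directories as needed."""
    path = root / f"{topic}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(note.rstrip("\n") + "\n")
    return path


def recall(root: Path, keyword: str) -> list[tuple[str, str]]:
    """Return (relative file path, matching line) pairs, ignoring case."""
    hits = []
    for path in sorted(root.rglob("*.md")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if keyword.lower() in line.lower():
                hits.append((path.relative_to(root).as_posix(), line))
    return hits


# Assumption: a throwaway temp directory stands in for the persistent computer.
memory_root = Path(tempfile.mkdtemp())
remember(memory_root, "projects/charts/decisions", "- Use React for the dashboard charts.")
remember(memory_root, "people/alice", "- Alice prefers weekly status updates.")
remember(memory_root, "projects/charts/todo", "- Ask Alice to review the chart colors.")

alice_hits = recall(memory_root, "alice")
for file_name, line in alice_hits:
    print(f"{file_name}: {line}")
```

The "incrementally smarter" part would live in how notes get organized and ranked; plain substring search is the rudimentary baseline.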
Just hand waving some “distributed architecture” and trying to duct tape modules together won’t get us any closer to AGI.
The building blocks themselves, the foundation, has to be much better.
Arguably the only building block that LLMs have contributed is that we have better user intent understanding now; a computer can just read text and extract intent from it much better than before. But besides that, the reasoning/search/“memory” are the same building blocks of old, they look very similar to techniques of the past, and that’s because they’re limited by information theory / computer science, not by today’s hardware or systems.
Probably need another cycle of similar breakthrough in model engineering before this more complex neural network gets a step function better.
Moar data ain’t gonna help. The human brain is the proof: it doesn’t need the internet’s worth of data to become good (nor all that much energy).
We can certainly get much more utility out of current architectures with better engineering, as "agents" have shown, but to claim that AGI is possible with engineering alone is wishful thinking. The hard part is building systems that showcase actual intelligence and reasoning, that are able to learn and discover on their own instead of requiring exorbitantly expensive training, that don't hallucinate, and so on. We still haven't cracked that nut, and it's becoming increasingly evident that the current approaches won't get us there. That will require groundbreaking compsci work, if it's possible at all.
What is that? What could merely require light elementary education and then take off and self-improve to match and surpass us? That would be artificial comprehension, something we've not even scratched. AI and trained algorithms are "universal solvers" given enough data. This AGI would be something different: this is understanding, comprehending. Instantaneous decomposition of observations for assessment of plausibility, and then recombination for assessment of combination plausibility - all continual and instant for assessment of personal safety: all that happens in people continually while awake, whether the safety being monitored is physical or, say, the loss of a client during a sales negotiation. Our comprehending skills are both physical and abstract. This requires a dynamic assessment, an ongoing comprehension that validates observations as a foundation floor, so that a more forward train of thought, a "conscious mind", can make decisions without conscious thought about lower-level issues like situational safety. AGI needs all that dynamic comprehending capability to satisfy its name of being general.
That's not how natural general intelligences work, though.
So then, if we can cook a chicken like this, we can also heat a whole house like this during winters, right? We just need a chicken-slapper that's even bigger and even faster, and slap the whole house to heat it up.
There's probably better analogies (because I know people will nitpick that we knew about fire way before kinetic energy), so maybe AI="flight by inventing machines with flapping wings" and AGI="space travel with machines that flap wings even faster". But the house-sized chicken-slapper illustrates how I view the current trend of trying to reach AGI by scaling up LLMs.
But I still see all the same debates around AGI - how do we define it? what components would it require? could we get there by scaling or do we have to do more? and so on.
I don't see anyone addressing the most truly fundamental question: Why would we want AGI? What need can it fulfill that humans, as generally intelligent creatures, do not already fulfill? And is that moral, or not? Is creating something like this moral?
We are so far down the "asking if we could but not if we should" railroad that it's dazzling to me, and I think we ought to pull back.
The morality of it depends on the details.
I doubt very much we will ever build a machine that has perfect knowledge of the future or that can solve each and every “hard” reasoning problem, or that can complete each narrow task in a way we humans like. In other words, it’s not simply a matter of beating benchmarks.
In my mind at least, AGI’s definition is simple: anything that can replace any human employee. That construct is not merely a knowledge and reasoning machine, but also something that has a stake in its own work and that can be inserted in a shared responsibility graph. It has to be able to tell that senior dev “I know planning all the tasks one year in advance is busy-work you don’t want to do, but if you don’t, management will terminate me. So, you better do it, or I’ll hack your email and show everybody your porn subscriptions.”
That is the goal function they are trained for; it is like dopamine and sex for humans, and they will do anything to get it.
Imitating humans would be one way to do it, but it doesn't mean it's an ideal or efficient way to do it.
All of our current approaches "emulate" but do not "execute" general intelligence. The damning paper above basically concludes they're incredible pattern matching machines, but that's about it.
For instance it is becoming clearer that you can build harnesses for a well-trained model and teach it how to use that harness in conjunction with powerful in-context learning. I’m explicitly speaking of the Claude models and the power of whatever it is they started doing in RL. Truly excited to see where they take things and the continued momentum with tools like Claude Code (a production harness).
(Also, LLMs don't have beliefs or other mental states. As for facts, it's trivially easy to get an LLM to say that it was previously wrong ... but multiple contradictory claims cannot all be facts.)
You have to implement procedurality first (e.g. counting, after proper instancing of ideas).
We implemented computing without any need of a brain-neural theory of arithmetic.
I really think it is not possible to get that from a machine. You can improve on what we have and do much fancier things than now.
But AGI would be something entirely different. It is a system that can do everything better than a human, including creativity, which I believe to be exclusively human as of now.
It can combine, simulate and reason. But think outside the box? I doubt it. That is different from being able to derive ideas from which a human would create. For that it can be useful. But that would not be AGI.
we are talking explicitly about a.g.i here, not debating if the computer can solve a majority of problems or not.
the two things can be true at the same time.
Why some people understood when they tried it with blockchain, nfts, web3, AR, ... any good engineer should know principle of energy efficiency instead of having faith in the Infinite monkey theorem
Not sure why people insist that the state of AI 2-3 years ago still applies today.
I see. So the author rejects the hypothesis of emergent behavior in LLM, but somehow thinks it will magically appear if the "engineering" is correct.
Self contradictory.
Because if they don't, I honestly don't think they can approach AGI.
I have the feeling it's a common case of lack of humility from an entire field of science that refuses to look at other fields to understand what it's doing.
Not to mention how to define intelligence in evolution, epistemology, ontology, etc.
Approaching AI with a silicon valley mindset is not a good idea.
I don’t see a problem, we’re great at just reinventing all that stuff from first principles
Right now, LLMs feel like they’re at the same stage as raw FLOPs; impressive, but unwieldy. You can already see the beginnings of "systems thinking" in products like Claude Code, tool-augmented agents, and memory-augmented frameworks. They’re crude, but they point toward a future where orchestration matters as much as parameter count.
I don’t think the "bitter lesson" and the "engineering problem" thesis are mutually exclusive. The bitter lesson tells us that compute + general methods win out over handcrafted rules. The engineering thesis is about how to wrap those general methods in scaffolding that gives them persistence, reliability, and composability. Without that scaffolding, we’ll keep getting flashy demos that break when you push them past a few turns of reasoning.
So maybe the real path forward is not "bigger vs. smarter," but bigger + engineered smarter. Scaling gives you raw capability; engineering decides whether that capability can be used in a way that looks like general intelligence instead of memoryless autocomplete.
Brains are continuous - they don’t stop after processing one set of inputs, until a new set of inputs arrives.
Brains continuously feed back on themselves. In essence they never leave training mode although physical changes like myelination optimize the brain for different stages of life.
Brains have been trained by millions of generations of evolution, and we accelerate additional training during early life. LLMs are trained on much larger corpuses of information and then expected to stay static for the rest of their operational life; modulo fine tuning.
Brains continuously manage context; most available input is filtered heavily by specific networks designed for preprocessing.
I think that there is some merit that part of achieving AGI might involve a systems approach, but I think AGI will likely involve an architectural change to how models work.
Here, AGI is being described as an engineering problem, in contrast to a “model training” problem. That is, I think he’s at least saying that more work needs to be done at an R&D level. I agree with those who are saying it is maybe not even an engineering problem yet, but it should be noted that he’s at least pushing away from just running the existing programs harder, which seems to be the plan with trillions of dollars behind it.
This part could do with sourcing. I think it seems clearly untrue. We only have three types of benchmark: a) ones that have been saturated, b) ones where AI performance is progressing rapidly, c) really newly introduced ones that were specifically designed for the then-current frontier models to fail on. Look at for example the METR long time horizon task benchmark, which is one that's particularly resistant to saturation.
The entire article is premised on this unsupported and probably untrue claim, but it's a bit hard to talk about when we don't have any clue about why the author thinks this is true.
> The path to artificial general intelligence isn’t through training ever-larger language models
Then it's a good thing that it's not the path most of the frontier labs are taking. It appears to be what xAI is doing for everything, and it was probably what GPT-4.5 was. Neither is a particularly compelling success story. But all the other progress over the last 12-18 months has come from models the same size or smaller advancing the frontier. And it has come from exactly the kind of engineering improvements that the author claims need to happen, both of the models and the scaffolding around the models. (RL on chain of thought, synthetic data, distillation, model-routing, tool use, subagents).
Sorry, no, they're not exactly the same kind of engineering improvements. They're the kind of engineering improvements that the people actually creating these systems thought would be useful and that actually worked. We don't see the failed experiments, and we don't see the ideas that weren't well-baked enough to even experiment on.
Will AGI require ‘consciousness’, another poorly understood concept? How is a mammalian brain even wired up? The most advanced model is the Allen Institute’s Mesoscale Connectivity Atlas, which is at best a low resolution static roadmap, not a dynamic description of how a brain operates in real time. And it describes a mouse brain, not a human brain, which is far, far more complex, both in terms of number of parts and architecture.
People are just finally starting to acknowledge LLMs are dead ends. The effort expended on them over the last five years could well prove a costly diversion along the road to AGI, which is likely still decades in the future.
Since there is no possibility of an uncontroversial specification of intelligence, there is no possibility of an honest and competent tester coming to the conclusion that sufficient testing has been performed on a candidate AGI to certify it.
Similarly, you won’t get any security professional to say that a system is guaranteed to be secure.
Moreover, you can’t get any honest psychologist to swear that any person definitely has no mental illness.
How companies are dealing with this is to bet that they can fool enough people so that the remaining skeptics can be safely ignored.
Intelligence must be built from a first principles theory of what intelligence actually is.
The missing science to engineer intelligence is composable program synthesis. Aloe (https://aloe.inc) recently released a GAIA score demonstrating how CPS dramatically outperforms other generalist agents (OpenAI's deep research, Manus, and Genspark) on tasks similar to those a knowledge worker would perform.
We don’t even know how.
We've lucked into these amazing abilities by just scaling.
But we don't really understand how they work.
And they are obviously missing a piece, some self-reflection, or continuous-loop operation perhaps, which we again don't understand.
Perhaps we'll do all this engineering and luck into the solution again, but I think probably not.
IME it’s both though. Better models, bigger models, and infrastructure all help get to AGI.
Here are the metrics by which the author defines this plateau: "limited by their inability to maintain coherent context across sessions, their lack of persistent memory, and their stochastic nature that makes them unreliable for complex multi-step reasoning."
If you try to benchmark any proxy of the points above, for instance "can models solve problems that require multi steps in agentic mode" (PlanBench, BrowseComp, I've even built custom benchmarks), the progress between models is very clear, and shows no sign of slowing down.
And this does convert to real-world tasks: yesterday, I had GPT-5 build me complex React charts in one shot, whereas previous models needed more constant supervision.
I think we're moving goalposts too fast for LLMs, and that's what can lead us to believe they've plateaued. But just try using past models for your current tasks (you can use open models to be sure they were not updated) and see them struggle.
Continuing to want to make a non-deterministic system behave like a deterministic system will be interesting to watch.
AGI would take making at least one full brain, and then putting many of those working together, efficiently.
I don't believe we can engineer our way out of that before explaining how the f. the wetware works first.
Trying to model AGI off how humans think, without including emotion as a fundamental building block, is like trying to build a computer that'll run without electricity. People are emotional beings first. So much of how we learn that something is good or bad is due to emotion.
In an AGI context that means:
Happiness: how do I build an unguided feedback mechanism for reward?
Fear: how do I build an unguided feedback mechanism to instruct to flee?
Sadness: how do I build an unguided feedback mechanism to instruct to seek external support?
Anger: how do I build an unguided feedback mechanism to push back on external entities that violate expectations?
Disgust: how do I build an unguided feedback mechanism to instruct to avoid?
Maybe building artificial emotions is an engineering problem. Maybe not. But approaches that avoid emotion entirely seem ill-advised.
The idea that you would somehow produce intelligence by feeding billions of reddit comments into a statistical text model will go down as the biggest con in history
(so far)
If we want to learn, look to nature, and it *has to be alive*.
But I feel this person falls short immediately, because they don't study neuroscience and psychology. That is the big gap in most of these discussions. People don't discuss things close to the origin.
We have to account for first principles in how intelligence works, starting from the origin of ideas and how humans process their ideas in novel ways that create amazing tech like LLMs! :D
How Intelligence works
In neuroscience, if you try to identify where and how thoughts are formed and how consciousness works, the answer is that it is completely unknown. This brings up the argument: do humans have free will if we are driven by these thoughts of unknown origin? That's a topic for another thread.
Going back to intelligence. If you study psychology and what forms intelligence, there are many human needs that drive intelligence, namely intellectual curiosity (need to know), deprivation sensitivity (need to understand), aesthetic sensitivity, absorption, flow, openness to experience.
When you look at how a creative human with high intelligence uses their brain, there are 3 networks involved: the default mode network (imagination network), the executive attention network, and the salience network.
The executive attention network controls the brain's computational power. It has a working memory that can complete tasks using goal-directed focus.
A person with high intelligence can alternate between their imagination and their working memory and pull novel ideas from their imagination and insert them into their working memory - frequently experimenting by testing reality. The salience network filters unnecessary content while we are using our working memory and imagination.
How LLMs work
Neural networks are quite promising in their ability to create a latent manifold within large datasets that interpolates between samples. This is the basis for generalization, where we can compress a large dataset in a lower dimensional space to a more meaningful representation that makes predictions.
The advent of attention on top of neural networks, to identify important parts of text sequences, is the huge innovation powering LLMs today. It is the innovation that emulates the executive attention network.
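For anyone who hasn't seen it written out, here is a minimal sketch of the attention idea, assuming the standard scaled dot-product form for a single query vector. The function names and the toy vectors are my own illustration, not anything from a specific model; real implementations work on batched matrices with learned projections.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query (illustrative helper).

    Returns the attention weights and the weighted sum of the values.
    """
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(dimension).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    # Blend the value vectors according to the weights.
    output = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(dim)
    ]
    return weights, output

# Toy example: the query matches the first key more closely,
# so the first value gets the larger weight.
weights, output = attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0], [3.0]],
)
print(weights, output)
```

The weights always sum to 1, so the output is just a weighted average of the values, with more weight on the positions whose keys look most like the query.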
However, that alone is a long distance from the capabilities of human intelligence.
With current AI systems, there is the origin, which is known vocabulary with learned weights coming from neural networks, with reinforcement learning applied to enhance the responses.
Inference comes from an autoregressive sequence model that processes one token at a time. This comes with a compounding error rate with longer responses and hallucinations from counterfactuals.
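To put a rough number on the compounding error point, here is a back-of-the-envelope sketch. It assumes each token is correct independently with the same probability, which is a simplification I'm introducing for illustration (real errors are correlated and models can sometimes recover). The function name is hypothetical.

```python
def sequence_success_probability(per_token_accuracy, num_tokens):
    """Chance that every token in a sequence is correct.

    Assumes independent, identically likely per-token errors,
    which is a simplifying assumption, not a measured property.
    """
    return per_token_accuracy ** num_tokens

# Even a 99% per-token accuracy decays quickly with length.
for n in (10, 100, 1000):
    print(n, sequence_success_probability(0.99, n))
```

Under that assumption, 99% per-token accuracy leaves only about a 37% chance that a 100-token response is error-free, which is the intuition behind longer responses being less reliable.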
The correct response must be in the training distribution.
As Andy Clark said, AI will never gain human intelligence: it has no motivation to interface with the world, conduct experiments, and learn things on its own.
I think there are too many unknown and subjective components of human intelligence and motivation that cannot be replicated with the current systems.