Concentrated capital is truly a wild thing.
DNA doesn't understand intelligence.
"The question of whether a computer can think is no more interesting than the question of whether a submarine can swim." - Edsger Dijkstra.
And what we have now, with LLMs, meets the standard I had for AGI just five years ago.
The only thing that has changed for me since then is noticing that not only does everyone have a different definition of what AGI even is, but also that none of the three initials is boolean-valued: "how intelligent" can be a number, or even a vector that varies by domain, as linguistic intelligence doesn't have to match spatial intelligence; "how general" could be what percentage of human cognitive tasks the AI can do; "how artificial" is answered entirely differently by those who care differently about learning from first principles vs. programmed algorithms vs. learning from reading the internet.
> People were misled by graphs that they were told pointed to a singularity right around the corner, despite obvious errors in the extant systems.
Tautology.
If there weren't obvious errors in the extant systems… there wouldn't be anything left to do.
Is this a requirement for achieving AGI? The history of progression of the ML field indicates that the answer is "no". We don't really understand how concepts are encoded in today's models, yet that doesn't stop them from being economically useful. So why would the special case of AGI be any different?
The specific term "AGI" has always referred to a program that strictly matches human intelligence, with no significant areas where it's worse than a typical human. I agree that we're not there and not obviously close to there. But the idea that it's all a parlor trick and LLMs have no cognition at all seems obviously false to me.
Today, LLMs trained to output text, images and video equal to or better than any human's are what we consider to be AI.
There is a semantic game we play to move the goalposts at every new tick of these technologies so that humans are still on top.
Maybe the next big trick is to define AGI as having a system that was not trained on the entire corpus of human output available on the internet, and train it with nothing. Can an AI be “born” with no model like a baby and learn to speak, walk, function in society without being primed for success with a state of the art model? Could you drop this AI into any human society, regardless of culture, and have it seamlessly integrate?
What precisely we consider “cognition” feels like a philosophical debate, one where we have no single true answer. And maybe it doesn’t really matter?
"Nerds with a thousand GPUs,
Nerds with a thousand streams,
with you only I experience,
the love, the chat of my dreams!"
"So prompt me, and list me,
index me, repeat me,
I'm yours till I elide,
so in love, so in love,
so in love with you, my GPT, am I."
-With apologies to Cole Porter and anyone blinded by reading this post.
Literally millions or billions of people are better than me at a significant number of intellectual tasks.
Having said that, 4.5 is clearly a misstep, one that should realign the goals of the AI industry to focus more on utility and cost effectiveness.
Not only can they not reach AGI, they cannot reach their own definition of AGI. But people will still gobble whatever next lie Altman will sell them for $2,000 a month.
Absolutely pathetic.
That definition is likely not theirs but came from Microsoft when it dumped 10B into OAI, as a condition for the revenue-sharing clause.
One can argue that AGI is already achieved: LLMs are more proficient and general in knowledge tasks than any specific individual.
Ans. No, no and the entirety of human cognition.
ChatGPT et al contain collections of words. When prompted they generate word sequences found in known word sources. That's all. They don't observe or reason about the world. They don't even reason about words. They merely append the most likely next word to a sequence of words.
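For anyone who wants to see what "append the most likely next word" means mechanically, here is a toy sketch. It assumes a simple bigram count table stands in for the model; real LLMs use learned neural networks over tokens rather than word counts, and the names `train_bigram` and `generate` are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(corpus: str) -> dict:
    """Count, for each word, which words follow it in the corpus."""
    words = corpus.split()
    follow = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follow[prev][nxt] += 1
    return follow


def generate(follow: dict, start: str, max_words: int = 5) -> list:
    """Greedily append the most frequent follower of the last word."""
    out = [start]
    for _ in range(max_words):
        candidates = follow.get(out[-1])
        if not candidates:
            break  # the last word never had a follower in the corpus
        out.append(candidates.most_common(1)[0][0])
    return out


corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(" ".join(generate(model, "on", max_words=2)))  # prints "on the cat"
```

The only "knowledge" in this toy is the count table; whether a much larger learned model doing the same append step counts as reasoning is the thing being argued about here.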
I still believe GPT-4 has some general intelligence, just a very tiny amount. It can take what it was trained on and slightly modify it to answer a slightly novel question.
And me running is closer to the speed of light than me walking, yet neither is even remotely close to the speed of light.
Also, it’s an inefficient allocation of capital. Every cent being spent on this is money that could be spent on something useful (of course, absent the AI bubble, not _all_ of it would be, but some of it would be).
That said, annoying people will move on to hyping something else.
Manhattan Project: Started 1942, combat use 1945.
Solid state transistor: Invented 1947, in radios 1954.
Hybrid cars, iPhone, etc etc.
Orion does seem to have been a failure, but I also find it a bit weird that they seemingly decided to release the full model rather than a distillation, which is the pattern we now usually see with foundation models.
So, did they simply decide that it wasn’t worth the effort and dedicate the compute to other, better things? Were they pushed by sama to release it anyway, to look like they were still making progress while developing something really next gen?
I agree that OpenAI's endless hyping about AGI seems pretty unrealistic, but let's take a breather here. There are no major research projects where you don't run into setbacks and failures before you reach your goals.
And even if they never really progress beyond where they are today with their current models, just bringing down the cost could open up a lot of doors for useful applications.
Someone will figure out how to make AIs understand their own ignorance and stop bullshitting when they don't know something, someone will figure out how to make AIs learn on the fly instead of fine-tuning new model versions, etc.
This layer itself will inevitably see its cost come down per unit of use on a long road towards commoditization. It will probably get better and better and more sophisticated, but again the value will be primarily up stack, not accrued primarily by a company like this. That's not to say they couldn't be a great company... even Google is a great company that has enabled countless other companies to bloom. The myopic way people look to these one-size-fits-all companies is just so disconnected from how our economy works.
If we are waiting for a new breakthrough architecture, it could be decades. Our brains send signals with very high concurrency to individual processors (neurons). Each neuron is massively more complex than a single ReLU function and we have billions of them. If we need that kind of parallel processing and data transfer to match human thought and flexibility, it could be another 70 years before we see AGI.
That said, I do think LLMs are one of the biggest AI breakthroughs since the inception of AI. And I am sure that they, or something very similar, will be part of an eventual AGI.
I could not agree more. There is much value still to be gained from blockchain technology!
The compute for training is beginning to seem a poor investment since it is depreciating fast and isn't producing value in this case. That's a seriously big investment to make if it's not productive but since a lot of it actually belongs to Azure they could cut back here fast if they had to. I hope they won't because in the hands of good researchers there is still a real possibility that they'll use the compute to find some kind of technical innovation to give them a bigger edge.
So does Google. And Google can roll out their premium models into phones, household devices, cars and online platforms to add value.
OpenAI has a website.
It's not even close.
A business that burns money at the rate OpenAI does, without any clear path to profitability, will eventually die.
I think you are wrong that this is the limit for single shot though. We have reached a limit due to data, but I expect what will happen next is that we will transition into a slow growth phase where chain of thought models will be used to effectively create more training data, which will then be used to train a better single shot model, which will then be extended into a better chain of thought model, which will then produce higher quality training data for another single shot. And so on. Kind of like what happens as knowledge passes from teacher to student across successive generations. Effectively a continuing process of compression and growth in intelligence, but progressing rather slowly compared to what we have seen in the last 5 years.
Right now, these models are built by the establishment to serve the establishment—and that’s a load of shit that needs to change. The real fun starts the day some random group of “anonymous” on 4chan can train one of these models to generate incredibly convincing deepfakes of world leaders, all using the compute power in their own homes.
Power to the people. Fuck the system, and all that jazz.
But here, I think he's right about business matters. The massive investment in computing capacity we've seen in recent years, by OpenAI and others, can generate positive returns only if the technology continues to improve rapidly so it can overcome its limitations and failure modes in the short run.
If the rate of improvement has slowed down, even temporarily, OpenAI and others like Anthropic are likely to face financial difficulties.
---
[a] In the words of Geoff Hinton: https://www.youtube.com/watch?v=d7ltNiRrDHQ
---
Note: At the moment, the OP is flagged. To the mods: It shouldn't be, because it conforms to the HN guidelines.
Also, even though LLMs can generate text much faster than humans, we may be internally thinking much faster. Each adult human brain has over 100 billion neurons and 100 trillion synapses, and each has been working every moment, for decades.
This is what separates human reasoning from LLM reasoning, and it can’t be solved by scaling the latter to anything feasible.
I wish AI companies would take a decent chunk of their billions, and split it into 1000+ million-dollar projects that each try a different idea to overcome these issues, and others (like emotion and alignment). Many of these projects would certainly fail, but some may produce breakthroughs. Meanwhile, spending the entire billion on scaling compute has failed and will continue to fail, because everyone else does that, so the resulting model has no practical advantages and makes less money than it cost to train before it becomes obsoleted by other people’s breakthroughs.
Disclosure - I am neither bearish nor a mega bull on LLMs. LLMs are useful in some cases.
A smart enough AI would summarize each of his posts as "I still hate the current AI boom".
There must be a term for such writers? He's certainly consistently on message.
D'oh!