He is comparing the energy spent during inference in humans with the energy spent during training in LLMs.
Humans spend their lifetimes training their brains, so for a fair comparison you would have to sum up that total training time and set it against the training time of LLMs.
At age 30 the brain's total energy use adds up to about 5000 Wh, which makes it 1440 times more efficient than LLM training.
But at age 30 we haven't learned good representations for most of the stuff on the internet, so one could argue that, given the knowledge acquired, LLMs outperform the brain on energy consumption.
That said, LLMs have it easier: they learn from an abstract layer (language) that already contains a lot of good representations, while humans first have to learn to parse all of this through imagery.
Half the human brain is dedicated to processing imagery, so one could argue the human brain only spent 2500 Wh on equivalent tasks, which makes it roughly 3000x more efficient.
Liked the article though, didn't know about HNSWs.
Edit: made some quick comparisons for inference
Assume a human spends 20 minutes answering in a well-thought-out fashion.
Human watt-hours: 0.00646
GPT-4 watt-hours (openAI data): 0.833
That makes our brains still 128x more energy efficient, though the human spends a lot more time generating the answer.
Edit: the numbers above are off by a factor of 1000, as I used calories instead of kilocalories to calculate the brain's energy expense.
Corrected:
human brains are 1.44x more efficient during training and 0.128x as efficient (i.e. about 8x less efficient) during inference.
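For concreteness, here is a minimal sketch of the back-of-the-envelope math in Python (the ~20 W brain draw and the 0.833 Wh per GPT-4 answer are the assumptions from above, not measurements):

    # Rough numbers only; assumes an average brain power draw of ~20 W.
    BRAIN_W = 20.0

    # Training: 30 years of continuous brain power, in kWh.
    brain_training_kwh = BRAIN_W * 30 * 365.25 * 24 / 1000  # ~5260 kWh
    gpt4_training_kwh = 1.44 * brain_training_kwh           # implied by the 1.44x claim

    # Inference: a 20-minute human answer vs. one GPT-4 answer.
    human_answer_wh = BRAIN_W * 20 / 60  # ~6.7 Wh
    gpt4_answer_wh = 0.833               # the OpenAI-derived figure above

    print(human_answer_wh / gpt4_answer_wh)  # ~8, i.e. the brain uses ~8x more per answer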
Looked at that way, LLMs are always an additional cost, never a saving: they add to the total energy bill rather than replacing anything.
ChatGPT has to deal with the languages we already created, it doesn't get to co-adapt.
I don't think this is true, personally. Ideally, as children we spend our time having fun, and learning about the world is a side effect. Applying this Borg-like thinking to intelligence just because we have LLMs is unusual to me.
I learned surfing through play and enjoyment, not through training like a robot.
We can train for something with intention, but I think that is mostly a waste of energy, albeit necessary on occasion.
What do you think "play" is? Animals play to learn about themselves and the world; you see most intelligent animals play as kids, with the play being a simplification of what they do as adults. Human kids similarly play fight, play build things, play cook food, play take care of babies, etc. It is all to make you ready for adult life.
Playing is fun because playing helps us learn; otherwise we wouldn't have evolved to play, we would have evolved to be like ants that just work all day long, if that were more efficient. The humans who played around beat those who worked their asses off; otherwise we would all be hard workers.
I think the part of this that resonates as most true to me is how it reframes learning in a way that tracks the truth more closely. It's not all the time, 100% of the time; it's in fits and starts, it's opportunistic, and there are long intervals that are not active learning.
But the big part where I would phrase things differently is in the insistence that play in and of itself is not a form of learning. It certainly is, or certainly can be, and while you're right that it's something other than Borg-like accumulation I think there's still learning happening there.
We don't know how to fully operate a human brain when it's fully disconnected from eyes, a mouth, limbs, ears and a human heart.
That doesn't sound right... 30 years * 20 Watts = 1.9E10 Joules = 5300 kWh.
Humans who spend a long time doing inference have not fully learned the thing being inferred. Unlike LLMs, when we are undertrained we don't show a huge spike in error rate; we just go slower.
When humans are well trained, human inference absolutely destroys LLMs.
This isn't an apt comparison: you are comparing a human trained in a specific field to an LLM trained on everything. When an LLM is trained with a narrow focus as well, the human brain cannot compete. See Garry Kasparov vs Deep Blue, and Deep Blue is very old tech.
I suppose they intended that as a back-of-the-envelope starting point rather than a strict claim. Even so, you have to be accountable to your starting assumptions, and I think a lot changes when this one is reconsidered.
We probably need to exclude the cerebellum as well (which contains about 50% of the neurons in the brain), as it's used for error correction in movement.
Realistically you probably just need a few parts of the limbic system: the hippocampus, the amygdala, and a few of the deep-brain dopamine centers.
Yes, we have learnt far more complex stuff, ffs.
i.e. not many humans invent calculus or relativity from scratch.
I think OP's point stands - these comparisons end up being overly hand-wavey and very dependent on your assumptions and view.
So yeah, you do use 2000 calories a day, but unless you live in an isolated jungle tribe, vast amounts of energy are consumed on delivering you food, climate control, electricity, water, education, protection, entertainment and so on.
I've come to the conclusion that gpt and gemini and all the others are nothing but conversational search engines. They can give me ideas or point me in the right direction but so do regular search engines.
I like the conversation ability but, in the end, I cannot trust their results and still have to research further to decide for myself if their results are valid.
I just go into the notebook tab (with an empty textarea) and start writing about a topic I'm interested in, then hit generate. It's not a conversation, just an article in a passive form. The "chat" is just a protocol in the form of an article, with a system prompt at the top and "AI: …\nUser: …\n" turns afterwards, all wrapped in a chat UI.
While the article is interesting, I just read it (it generates forever). When it goes sideways, I stop it and modify the text so it fits my needs, at a recent point or maybe earlier, and then hit generate again.
I find this mode superior to complaining to a bot, since wrong info or a wrong direction doesn't spoil the content. Also you don't have to wait or interrupt; it's just a single coherent flow that you can edit when necessary. Sometimes I stop it at "it's important to remember …" and replace that with a short disclaimer like "We talked about safety already. Anyway, back to <topic>" and hit generate.
Fundamentally, LLMs generate texts, not conversations. Conversations just happen to be texts. It’s something people forget / aren’t aware of behind these stupid chat interfaces.
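A minimal sketch of what that looks like in code, with a hypothetical complete() standing in for any raw text-completion endpoint (the transcript format is just the one described above):

    def complete(prompt: str) -> str:
        # Hypothetical stand-in for a raw text-completion model call.
        return " ...model-generated continuation... "

    # A "chat" is just a text document the model keeps extending.
    transcript = (
        "The following is an article about HNSW indexes.\n"
        "User: How does HNSW speed up nearest-neighbour search?\n"
        "AI:"
    )

    transcript += complete(transcript)
    # Stop anywhere, edit the string directly, call complete() again;
    # the model only ever sees one flat text.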
Reminds me of a similar argument about correctly pricing renewable power: since it isn't always-on (etc.), it requires a variety of alternative systems to augment it which aren't priced in. I.e., converting entirely to renewables isn't possible at the advertised price.
In this sense, we cannot "convert entirely to LLMs" for our tasks, since there are still vast amounts of labour in prompt/verify/use/etc.
Another thing a search engine cannot do, but that I use ChatGPT for on a daily basis, is taking unstructured text and converting it into a specified JSON format.
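For example, a minimal sketch using the OpenAI Python SDK (the model name and the target schema are illustrative assumptions, not details from the parent comment):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    note = "Met Dana Chen on 2024-03-12; she runs procurement at Acme, budget ~$40k."

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        response_format={"type": "json_object"},
        messages=[
            {"role": "system",
             "content": "Extract {name, company, role, date, budget_usd} as a JSON object."},
            {"role": "user", "content": note},
        ],
    )
    print(resp.choices[0].message.content)  # e.g. {"name": "Dana Chen", ...}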
I can do the opposite.
I can click the first result 1 billion times faster.
At this point it's just wasting people's time.
It’s exactly that for me, a conversational search engine. And the article explains it right: it’s just words organized in very specific ways so they can be retrieved with statistical accuracy, and the transformer is the cherry on top that makes it coherent.
You have a rough mathematical approximation of what's already a famously unreliable system. Expecting complete accuracy instead of about-rightness from it seems mad to me. And there are tons of applications where that's fine, otherwise our civilization wouldn't be here today at all.
And then you tell it such an API/case/etc. doesn't exist. And it'll immediately acknowledge its mistake and assure you it will work to avoid such mistakes in the future. And then, literally the next sentence in the conversation, it's back to inventing the same nonsense again. This is not like a human, because even with the most idiotic human there's at least a general trend to move forward. LLMs just coast back and forth on their preexisting training, with absolutely zero ability to move forward until somebody gives them a new training set to coast back and forth on, and repeat.
If a computer does not understand words, neither does your brain. While electric charge in the brain does not at all correspond to electric charge in a GPU, they do share an abstraction level, unlike words vs bits.
Computers right now do not understand language, but that does not mean that they cannot. We don't know what it takes to bridge the gap from stochastic parrot to understanding in computers, however from the mistakes LLMs make right now, it appears we have not found it yet.
It is possible that silicon-based computer architecture cannot support the processing and information-storage density/latency needed to support understanding. It's hard to gauge how likely that is, given how little we know about how understanding works in the brain.
Each neurone is itself a complex combination of chemical cycles; these can be, and have been, simulated.
The most complex chemicals in biology are proteins; these can be directly simulated with great difficulty, and we've now got AI that have learned to predict them much faster than the direct simulations on a classical computer ever could.
Those direct simulations are based on quantum mechanics, or at least computationally tractable approximations of it; QM is lots of linear algebra and either a random number generator or superdeterminism, either of which is still a thing a computer can do (even if the former requires a connection to a quantum-random source).
The open question is not "can computers think?", but rather "how detailed does the simulation have to be in order for it to think?"
Our brains are the product of the same dumb evolutionary process that made every other plant, animal, fungus, and virus. We evolved from animals capable of only the most basic form of pattern recognition. Humans in the absence of education are not capable of even the most basic reasoning. It took us untold thousands of years to figure out that "try things and measure if it works" is a good way to learn about the world. An intelligent species would be able to figure things out by itself; our ancestors, who had the same brain architecture we do, were not able to figure anything out for generation after generation. So much for our ability to do original, independent thinking.
It's a combination of what you have already seen, read about or heard of, isn't it?
First we must lay down certain axioms (smart word for the common sense/ground rules we all agree upon and accept as true).
One such axiom would be the fact that currently computers do not really understand words. ...
The author is at least honest about his assumptions, which I can appreciate; most other people just carry it as a latent assumption. For articles like this to be interesting, though, this cannot be accepted as an axiom: its justification is what's interesting.
> If you believe LLM have qualia, you also believe a ...
You use the word believe twice here. I am actively not talking about beliefs.
I just realised that the author indeed gave themselves an out:
> ... currently computers do not really understand words.
The author might believe that future computers can understand words. This is interesting. The questions being: _what_ needs to be in place for them to understand? Could that be an emergent feature of current architectures? That would also contradict large parts of the article.
While in practice axioms are often statements that we all agree on and accept as true, that isn't necessarily the case, and it isn't the core of the word's meaning.
Axioms are something we postulate as true, without providing an argument for its truth, for the purposes of making an argument.
In this case, the assertion isn't really used as part of an argument, but to bootstrap an explanation of how words are represented in LLMs.
Edit: I find this so amusing because it is an example of learning a word without understanding it.
Uhm… no?
They are literally things that can't be proven but allow us to prove a lot of other things.
Yes, we attach meaning to certain words based on previous experience, but we do so in the context of a conscious awareness of the world around us and our experiences within it. An LLM doesn't even have a notion of self, much less a mechanism for attaching meaning to words and phrases based on conscious reasoning.
Computers can imitate understanding "pretty well", but they have nothing resembling any notion, good, bad, or otherwise, of comprehension of what they're saying.
You have kids talking to this thing, asking it to teach them stuff, without knowing that it doesn't understand shit! "How did you become a doctor?" "I was scammed. I asked ChatGPT to teach me how to make a doctor pepper at home, and based on simple keyword matching it got me into medical school (based on the word doctor), and when I protested that I just wanted to make a doctor pepper, it taught me how to make salsa (based on the word pepper)! Next thing you know I'm in medical school and it's answering all my organic chemistry questions; my grades are good, the salsa is delicious, but dammit, I still can't make my own doctor pepper. This thing is useless!"
/s
If LLMs were capable of understanding, they wouldn't be so easy to trick on novel problems.
Firstly, do understand that I am not saying that LLMs (or ChatGPT) do understand.
I am merely saying that we don't have any sound frameworks to assess it.
For the rest of your rant: I definitely see that you don't derive any value from ChatGPT. As such, I really hope you are not paying for it or wasting your time on it; what other people decide to spend their money on is really their business. I don't think any normally functioning person expects a real person to be answering them when they use ChatGPT, so it is hardly a fraud.
Rather, it seems like a good general introduction to the realm, aimed at beginners. I'm not sure it gets everything right, and the author clearly states he is not an expert and would like corrections where he is wrong, but it seems worth checking out if one is interested in understanding a bit of the magic behind it.
Clickholes get too many votes.
To paraphrase: I will not excuse such a long letter, for you had the time to write a shorter one.
Power per hour makes no sense, since power is already energy (in joules) per unit of time (seconds).
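Spelled out, using the ~20 W brain figure from upthread:

    P = \frac{E}{t}
    \quad\Rightarrow\quad
    E = P\,t = 20\,\mathrm{W} \times 1\,\mathrm{h} = 20\,\mathrm{Wh} = 72\,000\,\mathrm{J}

So a watt-hour is a unit of energy; "watts per hour" would divide by time twice.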
But it also compares one human with the whole of GPT-4. It's like comparing a lemonade stand with Coca-Cola Inc.
It’s just so much more efficient than running their AI control software on silicon-based hardware!
My brain uses quantum mechanics for protein folding, yet my mind cannot perform the maths of QM.
I guess it was a misspelling rather than an allusion to the Roman stone pillar for distance measurement: https://en.m.wikipedia.org/wiki/Milion
No, it would not.
But that doesn't change your point, as there's no reason to require an intelligence to create evolution.
The analogy works, but not very far.
The same applies to LLMs in a way. If you calculate their capabilities out to some arbitrary extreme of back-end inputs and ability, based on the humans building them and all that they can do, you can arrive at a whole range of results for how capable and energy-efficient they are. But it wouldn't change the fact that the human brain, as its own device, does enormously more with much less energy than any LLM currently in existence. Our evolutionary path to that ability is secondary, since it's not a direct part of the brain's material resources in any given context.
The contortions by some to claim equivalence between human brains and LLMs are absurd when the blatantly obvious reality is that our brains are vastly more powerful. They're also, of course, capable of self-directed, self-aware cognition, which by now nobody in their rational mind should be ascribing to any LLM.
That’s a bit like saying human brains do not understand words. They operate on calcium and sodium ion transport.
> Shared slack channel if problems arise? There you go. You wanna learn more? Sure, here are the resources. Workshops? Possible.
> wins by far [...] most importantly community plus the company values.
Like, talking about "You can pay the company for workshops" and "company values" just makes it feel so much like an unsubtle paid-for ad I can't take it seriously.
All the actual details about the vector DB (for example, a single actual performance number, or a clear description of the size of the dataset or problem) are missing, making this feel like a very handwavy comparison; and the final conclusion is so strong, and worded in such a strange way, that it feels disingenuous.
I have no way to know whether this post is actually genuine or a piece of stealth advertising, but it hits so many alarm bells in my head that I can't help but ignore its conclusions about every database.
This complete lack of understanding is also why it's laughable to think we can do AGI any time soon, or perhaps ever. The reason for the AI winter cycle is the framing of it: this insane chase of AGI when it's not even defined properly. Instead, we should set out tasks to solve. We didn't make a better horse when we made cars and locomotives, and no one complains that these don't provide us with milk to ferment into kumis. The goal was to move faster, not to build a better horse.
https://aeon.co/essays/your-brain-does-not-process-informati...
But it doesn't mean the results are good.
At the current pace of development, AI will catch up in a decade or less.
• Current 3.5-family price is $1.5/million tokens
• Was originally $2/million tokens at launch, based on this quote: "Developers will pay $0.002 for 1,000 tokens — which amounts to about 750 words — making it 10 times cheaper" - https://web.archive.org/web/20230307060648/https://digiday.c...
• (I can't find the original 3.5 API prices even on archive.org, only the Davinci etc. prices; the Davinci models were $20/million, which is what "10 times cheaper" is relative to.)
There's also the observation that computers continue to get more power efficient. It's not as fast as Moore's Law was, but efficiency doubles roughly every 2.6 years, which is about 30% per year, or a thousand-fold every 26 years.
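A quick sanity check of that arithmetic:

    # Efficiency doubling every ~2.6 years, as stated above.
    doubling_years = 2.6
    per_year = 2 ** (1 / doubling_years)   # ~1.31, i.e. ~30% per year
    over_26y = 2 ** (26 / doubling_years)  # 2**10 = 1024, ~1000x over 26 years
    print(per_year, over_26y)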
He asked ChatGPT to do the math.