The fact that public LLM usage is leveling off at a price of $0 and Jensen "we make the shovels in this gold rush" Huang is rather desperately claiming that you need to spend $250k/year in tokens to be taken seriously suggests that demand saturation may not be that far off.
Whether Jevons' Paradox applies to software engineers is, I think, another open question. I'm constantly being told that it doesn't and that LLMs make half of us redundant now, but I'm skeptical - so much of the automation I see is broken or badly done.
This is the first or second inning in the LLM rollout. It'll take 15-20 more years for full integration of AI agents into the life of the typical person.
The claw experiments for example can just barely be considered alpha stage. They're early AI garbage unfit for the average person to utilize safely. That new world hasn't gotten near the typical person yet.
The compute requirements to get to full integration of AI agents into the life of the average person - billions of them - are far beyond 10x where we're at now.
This is an argument in favor of demand having leveled off.
This doesn't track at all with my experience. Everybody is using it everywhere.
Moreover people are using them for daily life tasks even when it is not an appropriate use of LLMs - e.g. getting medical advice as you referred to or writing emails which are clearly pissing off their coworkers.
In this respect I see it as akin to radium - a new technology that got a little too fashionable for its own good when it first emerged and which will likely have many use cases scaled back.
Personally, I experienced this when a specialist doctor got a drug interaction backwards, thinking A hinders the absorption of B, when it actually hinders the clearance, tripling the concentration of B.
Without AI, I would have been clueless about this and could not have spotted the mistake. I don't know if it would truly have been critical, but it did shake my confidence in doctors.
I'd be careful stating this is an inappropriate use of LLMs. I'm semi tapped in to the medical literature community and there is a lot of serious discussion and research going into the usage of LLMs for medical advice, and most of it is showing that LLMs are barely worse than doctors, and much, much cheaper and more convenient. They definitely aren't ready to completely replace doctors, but it seems they can provide competent medical advice in a pinch. Look out for the literature on this in the coming year; it's only in the last few months that researchers seem to be taking LLMs seriously.
No one in our Auto shop is using AI. One of the new diagnostic tools was demo'd with AI, and none of us were having it. It's about as accurate as Googling your symptoms.
My mother had an AI powered lung scan that came back with Stage 4 Cancer. The Oncologist got called in (for a fee!) to tell us it was just early stage COPD.
I'm asking 'cos while I'm philosophically opposed to the first option, I'd love to hear about anything that resembles the second.
This includes encouraging people to set up elaborate multi-model setups (e.g. "gas town") for coding that do not meaningfully improve productivity but which certainly do cause token usage to explode.
It also includes encouraging execs to use token consumption as a proxy for productivity - almost akin to SLOC.
AI has a halo right now and the managerial class seem to be willing to forgive almost any failure because the promise is so enticing. We're at peak expectations right now. They will soon start to be less forgiving when the warts which are intrinsic to LLMs remain unsolved.
As best as I can tell, that's the thinking. It's one number, it's very easy to find and manage, and there is a belief that it directly measures productivity.
I disagree that it does; seems to me the throughput of useful features is a better measure, but I'm not in the driver's seat on this one.
Ultimately the performance will be assessed via the income statement and cash flows of customers of the model producers.
Frankly in the window pre-IPO it’s in the best interests of OAI et al to show a line going to the top-right in relation to tokens, in their prospectus. What does that mean?
Strategic manipulation.
The market has achieved its current saturation level with loss-leader prices that remind me of the Chinese bike share bubble[0]. Once those prices go up to break-even levels (let alone profitable levels), the number of people who can afford to pay will go down dramatically (and that's not even accounting for the bubble pop further constricting people's finances).
If all they do is hike prices then they'll lose customers to competitors who don't or who find a way to serve a similar model cheaper.
The demand isn't going to go away purely through higher prices. Once people know something is possible they will demand it whether supply is constrained or not. That's a huge bounty for anyone who can figure out how to service that demand.
Personally, I would have used all those tokens to generate synthetic data for IDA (iterated distillation and amplification) so that the more efficient 1000 token/answer chat model can answer more questions, but apparently that doesn't justify an insane datacenter buildout.
Claude Code and co. can now analyze an enterprise codebase to debug issues in a system with multiple services involved.
I don't see how that would have been possible at all in the past.
The ceiling of token use when everyone has something akin to OpenClaw just running as a background process on their phone is way higher than there’s supply for right now. Jevons paradox is still in full force.
The price is very much not $0; even 'free' models have usage limits that equate to a shadow price.