How the AI Bubble Bursts

(martinvol.pe)

372 points · martinvol · 2mo ago · 523 comments

226 comments · 45 top-level

joshstrange2mo ago· 42 in thread

> RAM prices are crashing because new models won’t need as much

Reality begs to differ [0] and following the link for that text goes to an article [1] where they talk about Google's TurboQuant which supposedly will lower the RAM requirements. Now if that means RAM prices come down (as speculated, not reported on, in the link) or the AI companies just do more things with their extra ram is yet to be determined. The fact this article links there with text "RAM prices are crashing" throws the entire rest of the article into doubt for me.

RAM prices are most certainly not crashing (yet) and treating it as a foregone conclusion because _one_ lab found gains could be made and hasn't even reported on the efficiency of their method is just irresponsible. It's almost as bad as when LLMs link things to prove their point, you visit the link, and find it says nothing of the sort or even the opposite.

[0] https://pcpartpicker.com/trends/price/memory/

[1] https://tech.sportskeeda.com/gaming-news/how-google-s-new-tu...

amelius2mo ago

> Now if that means RAM prices come down (as speculated, not reported on, in the link) or the AI companies just do more things with their extra ram is yet to be determined.

I think it is determined:

https://en.wikipedia.org/wiki/Jevons_paradox

woadwarrior012mo ago

Yeah, even if one efficiency trick lands, people will end up spending the saved budget right back on bigger models, and/or more "thinking" tokens.

pydry2mo ago

Jevons paradox only applies if demand hasn't already been saturated.

The fact that public LLM usage is leveling off at a price of $0 and Jensen "we make the shovels in this gold rush" Huang is rather desperately claiming that you need to spend $250k/year in tokens to be taken seriously suggests that demand saturation may not be that far off.

Whether Jevons' Paradox applies to software engineers is, I think, another open question. I'm constantly being told that it doesn't and that LLMs make half of us redundant now, but I'm skeptical - so much automation I see is broken or badly done.

fotcorn2mo ago

Also, there is zero reason to think that the big labs did not have anything similar to TurboQuant for a long time already.

The recent blog post from Google announcing TurboQuant does not change anything regarding RAM planning for the big labs.

TurboQuant itself is already a year old! So even smaller labs have probably seen and implemented it.

scw2mo ago

TurboQuant has a specific benefit by compressing the KV cache at a negligible cost to quality. That mainly means that the context lengths can go up in models for the same amount of memory, however the KV cache only accounts for something like 20% of the overall model size, and this will not dramatically decrease memory demands in the way that some of the more sensationalist reporting has stated.
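
The arithmetic behind that point can be sketched roughly as follows. This is a minimal illustration, not a measurement: the 20% KV-cache share is the comment's own rough figure, and the compression ratios are hypothetical values chosen for the example.

```python
def total_memory_saving(kv_share: float, compression_ratio: float) -> float:
    """Fraction of total memory freed when only the KV cache is compressed.

    kv_share: fraction of total memory held by the KV cache (0..1).
    compression_ratio: how many times smaller the KV cache becomes (>= 1).
    """
    if not 0.0 <= kv_share <= 1.0:
        raise ValueError("kv_share must be between 0 and 1")
    if compression_ratio < 1.0:
        raise ValueError("compression_ratio must be at least 1")
    # Only the KV-cache slice shrinks; everything else stays the same size.
    return kv_share * (1.0 - 1.0 / compression_ratio)


if __name__ == "__main__":
    kv_share = 0.20  # rough figure quoted in the comment above
    for ratio in (2, 4, 8):  # hypothetical compression ratios
        saving = total_memory_saving(kv_share, ratio)
        print(f"{ratio}x KV compression -> {saving:.1%} of total memory freed")
```

Under these assumptions the saving can never exceed the KV cache's own share, so even an arbitrarily good compressor frees at most about a fifth of total memory.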

schmidtleonard2mo ago

The open source tooling got quantization support 3 years ago! It was a lesser type of quantization, but more than enough to prove that the savings just go to bigger models.

adjejmxbdjdn2mo ago

I’m not disagreeing with you, but consumer RAM prices are lagging indicators. If commercial RAM prices are dropping then consumers will see those price drops last, especially given the fact that several consumer manufacturers turned to commercial only.

drakythe2mo ago

Is there a source that says commercial RAM prices are dropping? I was recently told (without a source, so I am not sure if it is true or not) that OpenAI never even bought any of the RAM they signed deals on last year, and that those deals were just letters of intent. So if prices are coming down I wouldn't be shocked but the economy is pretty well vibe coded these days so who even knows.

ToucanLoucan2mo ago

If they see them. Plenty of businesses are still charging pandemic prices for all kinds of goods and simply pocketing the difference.

Cars come to mind instantly. Prices exploded in 2020/1, due to legitimate shortages, most of which have been plus or minus resolved, but the prices for new (and used!) cars never came back down.

eru2mo ago

Why would they be lagging?

slfnflctd2mo ago

> almost as bad as when LLMs link things to prove their point, you visit the link, and find it says nothing of the sort or even the opposite

To be fair, they got it from us. This happened to me plenty of times long before modern LLMs.

throwup2382mo ago

It learned by reading HackerNews, after all.

ajross2mo ago

> Reality begs to differ

Honestly you're both wrong. RAM prices spiked speculatively, and they're going down for the same reason. Market people always want to argue in fundamentals, when in practice *ALL* the high frequency components of the signal are down to a bunch of traders trying to guess where it's going in the short term.

At best those guesses are informed by ground truth ("AI needs a lot of RAM!" "Sam cornered the market!" "TurboQuant needs less RAM!"), but they remain guesses, and even then you can't tell the difference between that and random motion.

T-A2mo ago

> RAM prices spiked speculatively, and they're going down for the same reason.

https://pcpartpicker.com/trends/price/memory/

Note how flat the black lines are.

Then note how wide the gray bands are. That makes it very easy to cherry-pick a few examples to present as "supporting evidence" that prices are doing whatever you want to believe they are doing.

Forgeties792mo ago

I’ll believe they’re going down when it doesn’t cost $550 for the $105 ram I purchased 1 year ago. Yes consumer prices lag commercial prices yada yada, I think any hot takes are pointless until we see lower prices or far more convincing evidence it’s coming. When it costs basically a MacBook neo for 32gb of DDR5 ram it’s hard to hear “ram is coming down for sure”

cma2mo ago

> RAM prices spiked speculatively

Didn't OpenAI buy up 40% of the capacity all at once?

h14h2mo ago

I do wonder how closely prices for consumer RAM kits follow the wholesale prices for NAND chips that manufacturers see internally. The pcpartpicker graphs you linked show consumer prices have leveled out and may even be starting to fall. Depending on how the economics shake out this could mean we've hit an inflection point.

My personal prediction is that once the VC bill comes due and prices for frontier models start to climb, competition for efficiency will heat up. The main AI use-cases seem to be falling into buckets, and I doubt serving gigantic, do-it-all general models for every use-case under the sun is remotely cost-effective.

If common use-cases start to be more efficiently served by smaller, more efficient purpose-built models (or systems thereof), it'd make the big frontier models increasingly niche. Cursor's Composer 2 model is a great example of this.

In any case, I think it's pretty fair to speculate we may be seeing RAM prices start falling sooner rather than later.

joshstrange2mo ago

Consumer vs NAND is an absolutely fair distinction to make; I'm not sure how to track those prices. My main issue is the article saying "RAM prices are crashing" (which I can't find any evidence of) and linking to an article that doesn't even repeat that claim; it instead just speculates that maybe RAM will come down in price due to this new idea.

> In any case, I think it's pretty fair to speculate we may be seeing RAM prices start falling sooner rather than later.

I sure hope so. RAM, HDDs, and SSDs are all crazy-high right now and I was in the market for literally all 3 but have paused all my buying because I can't justify the costs as they stand today.

layer82mo ago

I agree. The article they link to talks about memory company stocks crashing, not RAM prices crashing. There is some truth to the former: https://www.ft.com/content/e4e15692-187e-4466-832e-ec267e792...

martinvolOP2mo ago

RAM prices haven't crashed yet and it'll take time because it has to propagate within the supply chain. Micron is -20% from the top already https://www.investing.com/equities/micron-tech

Stock price is the best forward indicator I can think of

cwillu2mo ago

That might be true, but it's still straightforwardly wrong to say that RAM prices have crashed, and it calls into question everything else they write.

albinn2mo ago

I would think that we are going to see RAM prices increase even more, given, among other things, pure helium disruptions and increased electricity prices.

I haven't looked closely into TurboQuant, but perhaps it will revolutionize just as much as the 1-bit llm did...

aurareturn2mo ago

Even if TurboQuant, which was released a year ago, drastically lowers RAM requirements, AI labs will just release bigger models.

Jevons Paradox. When are we going to learn that efficiency gains in AI do not decrease hardware usage?

functional_dev2mo ago

valid point, it reminds me of video games. GPUs got faster, devs pushed higher resolutions, more complex lighting instead of saving power :)

gmerc2mo ago

Consumer RAM is starved by production capacity shifting to HBM. HBM dropping in price would not affect consumer RAM on any immediate timeline. Also, as pointed out by many, Jevons Paradox.

am17an2mo ago

Thank you, there are two things I would like to point out:

1) Google releasing something probably means they don't see it as important. 4-bit KV-cache quantization has been known for a long time. The fact there is almost a mass hysteria about this paper makes me think there is a lack of skepticism in this AI mania, even in a relatively tech-savvy crowd.

2) But prices for memory companies are crashing! Look around, the whole market is crashing.

sandworm1012mo ago

There is also demand for RAM in other areas of data centers. As we are all pushed deeper into clouds, I can see the rise of RAM for data storage (RAM drives) continuing to eat into the supply. A module of DDR5 will be more useful in a Netflix rack streaming movies 24/7 than in a gaming PC where it may only be used an hour or two every day.

veunes2mo ago

Bingo. Even if some magic drops tomorrow that compresses the KV cache down to literally zero bits, that saved VRAM will instantly get swallowed up by bumping the batch size or pushing the context window to 10 million tokens. There is no such thing as "excess memory" in ML, only under-trained models
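
As a rough sketch of why freed KV-cache memory tends to get reused, the standard sizing formula (two tensors, keys and values, per layer) can be written out. The model shape below is hypothetical and chosen only to give round numbers, and the calculation ignores any per-block scale or metadata overhead a real quantization scheme would add.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, bits_per_value: int) -> int:
    """Approximate KV-cache size in bytes for a transformer decoder."""
    # Factor 2: each layer stores one key tensor and one value tensor.
    total_bits = (2 * layers * kv_heads * head_dim
                  * seq_len * batch * bits_per_value)
    return total_bits // 8


if __name__ == "__main__":
    # Hypothetical model shape, not taken from any specific model.
    shape = dict(layers=32, kv_heads=8, head_dim=128)

    baseline = kv_cache_bytes(**shape, seq_len=8192, batch=1, bits_per_value=16)
    longer = kv_cache_bytes(**shape, seq_len=32768, batch=1, bits_per_value=4)

    gib = 2 ** 30
    print(f"16-bit cache,  8k context: {baseline / gib:.2f} GiB")
    print(f" 4-bit cache, 32k context: {longer / gib:.2f} GiB")
```

With this shape, a 4-bit cache at four times the context length occupies the same memory as the 16-bit cache at the original length, which is the sense in which the savings get absorbed by longer contexts or larger batches.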

maeln2mo ago

> > RAM prices are crashing because new models won’t need as much

> Reality begs to differ [0] and following the link for that text goes to an article [1] where they talk about Google's TurboQuant which supposedly will lower the RAM requirements. Now if that means RAM prices come down (as speculated, not reported on, in the link) or the AI companies just do more things with their extra ram is yet to be determined. The fact this article links there with text "RAM prices are crashing" throws the entire rest of the article into doubt for me.

I find it fascinating how extremely reactive things have become. One research paper which, to my knowledge, hasn't been externally replicated yet, nor implemented, generates tons of hyperbolic articles, tweets and such, and actually manages to move the market, at least temporarily. Not just this, but a simple message in full caps lock by the president of the U.S., who is in the habit of lying through his teeth constantly, and the same thing happens. It's like there is a big bubble that threw any form of critical thinking out of the window and is in a hurry to react to anything, even if it is not even remotely believable. Now I understand why it happens: there is a lot of money that can be made by capitalizing on FOMO, either by driving traffic to their website, socials, etc., or by simply insider trading (which feels like it has been legalized these days). But I still find it incredible the proportions it has started to take.

JCTheDenthog2mo ago

My favorite was when Google revealed Project Genie a month ago (which lets you generate video game worlds with AI, basically) and stocks for game companies immediately dropped. Anyone familiar with games and gaming knows that what Project Genie offers (essentially empty worlds with minimal interactivity that you can just kind of wander around in, and they struggle with simple things like object permanence if you look away) isn't real competition for actual games, but the markets reacted anyways.

incognition2mo ago

You nailed it. It's algos and noise trading

faangguyindia2mo ago

If the gains are real, why are the limits so bad? Google can barely serve Antigravity.

BoredPositron2mo ago

You get more Claude tokens from a Google subscription via Antigravity than from Anthropic. Especially if you use the 5 other "family" accounts you can share the subscription with...

owlmirror2mo ago

Isn't that at the moment still a free product? Of course they will not prioritize serving those requests. That tells you nothing.

tracker12mo ago

Even worse, 3 memory companies control well over 90% of the international market, with a history of cartel collaboration that's going to be ever harder to prove with fewer companies.

hintymad2mo ago

Some also argue that the RAM price keeps rising because of the bullwhip effect. I was wondering if there's any way for us to differentiate sustained demand from the bullwhip effect.

Valakas_2mo ago

This article and the title are total clickbait filled with emotional hooks. And it worked. You totally debunked it but look how it still became so popular.

hirako20002mo ago

Not crashing yet. The article is looking 1 to 5 years to come.

Given Nvidia's CEO's agitation I would give credit to the prediction, and if it's correct the price will go back to what it was, or even lower if investments in capacity are made today.

michaelcampbell2mo ago

My take is new capabilities will consume any price reductions, making them moot. At least in the medium term.

A RAM price drop due to some magic efficiencies assumes everything else doesn't change, which I doubt anyone honestly thinks will be the case.

sigmoid102mo ago

Yeah, I also stopped reading at that point. If I want a bunch of random, made up facts to sell lukewarm opinions or steer the uneducated masses, I'll tune in on a Trump press conference. Why does this feel like someone is desperately trying to make reality mirror his flailing market bets?

Forgeties792mo ago

Sometimes it's real easy to see who has risky short positions right?

mNovak2mo ago

This feels similar to when Deepseek first debuted with claims of ultra-low cost training, and all the pundits exclaimed that Nvidia was finished, the bubble had burst, etc.

infecto2mo ago· 39 in thread

It’s incredible how polarizing the AI rush is. I keep the perspective that the technology is an absolute step change but I have no idea where the cards will fall. I take a lot of issue with this style of article. I get a sense that the authors are being overly defensive.

The cost to serve tokens is absolutely profitable today and that’s been true for at least a year. What’s unclear is how R&D and capex fit into the picture. I am not that pessimistic on this front either though. For the data center build outs, demand for tokens is still exceeding supply. On the R&D front, well most of us here on HN have benefited from decades of overinflated engineering salaries being paid by often companies that were not profitable and not only unprofitable, usually without a plan for success. In this current rush, companies cannot keep up with demand; it’s a much easier math problem when you have something that people want (tokens) and you need to figure out profitability when including R&D.

Aperocky2mo ago

Demand of tokens is absolutely skyrocketing.

And unlike the traditional "this will replace humans right away", I think what this introduces is a lot of incentive to spread those tokens in places where there was never any incentive to hire a software engineer previously. In turn, that will drive a lot of business activity in those areas that will potentially fail given the current quality of the output.

This feels like a boom before bust scenario, and I'm not even sure if it will bust.

skeeter20202mo ago

Maybe we need to focus on a better definition of "bust" but we will surely see something along the lines of the hype-cycle graph in AI; what technology has not fallen into the trough before (best case) reaching a more steady-state of use and growth?

WarmWash2mo ago

>potentially fail given the current quality of the output.

The question is how big the fail is if you measure it in 3 month increments going back to late 2022.

hirako20002mo ago

Tulips sales also skyrocketed.

Seriously, what value are tokens providing other than justifying layoffs. Concretely. Today. Not in the speculating scenario that cardiologist could be replaced with models.

We see this new trend of agentic coding, again a promise that software will be written that way going forward, despite the number of fiascos already experienced when trusting a model turned bad. The use case may provide value, but right now all it does is fulfill the push for token consumption all these AI leaders are advocating for.

gdilla2mo ago

the busting will come from the token consumers. so many disasters waiting to happen.

boriskourt2mo ago

> The cost to serve tokens is absolutely profitable today and that’s been true for at least a year.

> For the data center build outs, demand for tokens is still exceeding supply.

Can you provide any numbers for this?

wongarsu2mo ago

I can get Kimi K2.5 inference on openrouter for about $0.5/MTok input + $2.5/MTok output, from six providers that have no moat besides efficiently selling GPU time. We can assume they are doing so at a profit (they have no incentive to do this at a loss), giving us those numbers as the cost to serve a 1T-a32b model at scale.

Now we don't know the true size of any of the proprietary models, but my educated guess is that Sonnet is in about the same parameter range, just with better training and much better fine tuning and RLHF. Yet API pricing for Sonnet is $3/MTok input + $15/MTok output, exactly six times as expensive. Even Haiku is twice as expensive as Kimi K2.5.

I find it difficult to believe in a world where those API prices aren't profitable. For subscription pricing it's harder to tell. We hear about those that get insane value out of their subscription, but there has to be a large mass who never reaches their limits. With company-wide rollouts there might even be a lot of subscription users who consume virtually no tokens at all.
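
The price comparison above can be checked with a few lines. This is only a sketch: the per-MTok prices are the ones quoted in the comment, while the workload size (10 MTok in, 2 MTok out) is a hypothetical mix picked for illustration.

```python
def workload_cost(price_in: float, price_out: float,
                  mtok_in: float, mtok_out: float) -> float:
    """Dollar cost of a workload given $/MTok prices and token volumes in MTok."""
    return price_in * mtok_in + price_out * mtok_out


if __name__ == "__main__":
    # Prices in $/MTok as quoted in the comment above.
    kimi = {"price_in": 0.5, "price_out": 2.5}
    sonnet = {"price_in": 3.0, "price_out": 15.0}

    # Hypothetical workload: 10 MTok of input, 2 MTok of output.
    mix = {"mtok_in": 10, "mtok_out": 2}

    kimi_cost = workload_cost(**kimi, **mix)
    sonnet_cost = workload_cost(**sonnet, **mix)
    print(f"Kimi K2.5: ${kimi_cost:.2f}")
    print(f"Sonnet:    ${sonnet_cost:.2f}")
    print(f"Ratio:     {sonnet_cost / kimi_cost:.1f}x")
```

Because the input and output prices both differ by the same factor, the ratio comes out to 6x regardless of the input/output mix chosen.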

ACCount372mo ago

Check the token prices for open weight LLMs at various independent inference providers.

That gives you a very good estimate of "how much can you serve the tokens of a model of the size N for while making a profit".

Now, keep in mind: Kimi K2.5 is 1T MoE. Today's frontier LLMs are in the 1T to 5T range, also MoE. Make an estimate. Compare that estimate with the actual frontier lab prices.

bob10292mo ago

https://www.cerebras.ai/blog/cerebras-cs-3-vs-nvidia-dgx-b20...

paulddraper2mo ago

Anthropic has said inference is profitable. That’s a biased source, but the math pencils.

This is why switching to local open weight models saves a lot of money. (Even though it’s not apples to apples.)

infecto2mo ago

Most/all private labs have cited inference is profitable. This was happening before the large push to scrap plans and largely charge folks the underlying api rates. Second take a look at the pricing of open models. Now certainly it’s not direct 1-1 comparison but we can use it as a baseline. Now of course folks might not be telling the truth but one of those situations where I see too many markers on the true side.

For supply look at outages and growth rates at companies like openrouter. The demand is growing every week.

iterateoften2mo ago

According to open router token demand is growing at something like 10% a week

It’s insane
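
To put that figure in perspective, here is a quick compounding sketch. It assumes the "something like 10% a week" rate holds constant for a full year, which is an illustrative assumption and not something the thread establishes.

```python
import math


def growth_multiple(weekly_rate: float, weeks: int) -> float:
    """Total growth factor after compounding a weekly rate for a number of weeks."""
    return (1.0 + weekly_rate) ** weeks


def doubling_time_weeks(weekly_rate: float) -> float:
    """Weeks needed for volume to double at a constant weekly growth rate."""
    return math.log(2.0) / math.log(1.0 + weekly_rate)


if __name__ == "__main__":
    rate = 0.10  # approximate weekly growth quoted in the comment above
    print(f"Doubling time: {doubling_time_weeks(rate):.1f} weeks")
    print(f"After 52 weeks: {growth_multiple(rate, 52):.0f}x")
```

At a steady 10% per week, volume doubles in a little over seven weeks and grows by roughly two orders of magnitude in a year, which is why a rate like that is hard to sustain for long.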

infecto2mo ago

I wish this was higher up. I have been tracking the same since Thanksgiving ‘25 and the growth is unreal. Again I don’t know where the cards fall maybe the industry overspent on capex but it’s at least easier to see why they are spending based on demand. The risk of being left out is greater than overbuilding.

chrisweekly2mo ago

> "decades of overinflated engineering salaries"

'Overinflated' relative to what? You make some good points but I don't accept this as a premise.

schmidtleonard2mo ago

Overinflated relative to the wet dreams of the ownership class.

fcarraldo2mo ago

Well, not GP, but I do. Let’s look at the numbers:

Median senior SWE salaries in SF: https://www.levels.fyi/t/software-engineer/levels/senior/loc...

Median income in metro areas: https://www.cnbc.com/2024/07/11/the-median-salary-for-the-25...

Engineering salaries are significantly higher than nearly every other industry on average and on median. Much of this is driven by VC funding rather than sound, profitable, bootstrapped businesses with sustainable profit margins.

Engineering salaries have also been driven upwards significantly the past ~10 years (since the post-2008 crash recovery), while wage growth in the US is mostly stagnant. I don’t have a source handy for that, but there are plentiful studies.

Outside of the US this may be less true, but I took GP’s “most of us on HN” to mean people who work in US tech companies which are primarily concentrated in high COL areas.

malfist2mo ago

> The cost to serve tokens is absolutely profitable today

How can you possibly say that? Everyone knows that's not the case, these companies are losing money every day selling tokens. Revenue is not the same thing as profit.

dist-epoch2mo ago

There are private companies which rent/buy GPUs, run open-weight LLMs on them and sell the tokens. They absolutely make profit, and their clients think they get a good deal and are buying the tokens.

infecto2mo ago

Don’t confuse what I say. Bottom line, these companies are not profitable yet, but it is profitable to serve a token via the API. They have increasing demand, not enough supply, and models are getting better on quick timelines. For sure there may be some losers but it’s not hard to see that token serving can be a profitable activity.

naravara2mo ago

I think they’re losing money because they have to amortize the costs of training the models in the first place, which is where most of the resource sink is.

This is why they were freaking out about DeepSeek just taking the trained model weights and slapping an interface on it.

jeromegv2mo ago

Yep, especially if we look at what happened just last week, both Google and Anthropic have dropped how much you get out of your existing plans.

Tade02mo ago

My main worry is - once this is all over, the market consolidates and using LLMs will become a requirement in job listings, what's the highest price per million tokens companies will be able to charge us?

Currently on a given day I'm chewing through approximately the equivalent of my lunch money, but where there's opportunity to extract wealth, someone will find a way to do it.

h14h2mo ago

My (potentially naive) take is that open models will save us. The biggest markets for LLMs (e.g. coding) are narrow-enough to be served well by smaller models with proper RL. Cursor's Composer 2 (created from a Kimi K2.5 base) is a great example, and I expect it to be the first of many.

The wealth of great open models provide an excellent base for fine-tuning, distillation, and RL. I see a lot of untapped potential in the field of bespoke, purpose-built models that can be served far more cheaply than the frontier competition. I would not be surprised if we see frontier-adjacent experiences running comfortably on a Mac Mini by year end.

With frontier models seemingly hitting diminishing returns in quality, I struggle to see a world in which gigantic, expensive, general-purpose models don't become increasingly niche.

bluedays2mo ago

It's already a job requirement for a bunch of places, they're just not listing it. I lost out on a job recently because I haven't used cursor ai.

dist-epoch2mo ago

Jensen is already talking about $1000/mil tokens soon.

But there is no real higher limit. Imagine a LLM which could answer the question "what does my company need to do to beat the competition?". And then realize that the competition asks their LLM the same question. So now everybody is bidding the price up or using more tokens to get a better answer

thereitgoes4562mo ago

> The cost to serve tokens is absolutely profitable

Can you explain why you know better than the analyst at Cursor cited in this article?

iterateoften2mo ago

OpenRouter pricing is an upper bound on compute cost for the open source models. So people assume that Opus and Sonnet really aren't sucking up 10x the resources because open source models aren’t 10x worse. Idk if it’s true or not, but Haiku is $5/m tokens and it is much worse than the $2-3/mt models imo

infecto2mo ago

Can you cite your source for an analyst at Cursor? I read the article and looked through the boatload of links but struggled to find what you are referring to. Ty

noelsusman2mo ago

That analyst was talking about subsidizing tokens through the subscription plans, which is a different claim.

SirensOfTitan2mo ago

This is a classic HN mistaking the map for the territory. R&D and capex absolutely figure into de-facto profitability and sustainability for AI labs, despite their separate treatment in accounting.

> well most of us here on HN have benefited from decades of overinflated engineering salaries being paid by often companies that were not profitable and not only unprofitable

This is a really concerning perspective: people were paid what they were worth. Software is or was one of the few remaining arenas wherein a person can find a middle or upper middle class lifestyle consistently.

I will also note: a startup raising an 8 MM series A and eventually fizzling out is not the same as the hundreds of billions invested in these AI companies without a path to profitability. It is utterly absurd to pretend these are the same thing: any company ingesting that much cash needs to justify its capacity to survive.

ForHackernews2mo ago

> any company ingesting that much cash needs to justify its capacity to survive.

What, why? There are tons of low-margin capex-intensive business out there.

I think AI will end up being like hosting. All the models will converge to being pretty-decent and the companies will have to compete on efficiency since they are selling a generic commodity.

You can already see Anthropic fears this scenario since they try so hard to make people use their first-party tools rather than plugging Claude in as a generic part of a third-party stack.

LLM hosting is the next VPS.

guzfip2mo ago

> Software is or was one of the few remaining arenas wherein a person can find a middle or upper middle class lifestyle consistently.

I want to add something additional to this: it is one of the few fields that can afford middle or upper middle class lifestyle and is accessible.

I have no doubt if I could redo my life with the necessary resources I’d be more than capable of putting myself through med school and gone with a secure career that paid more than I ever made in software.

But at this stage of life? I don’t have the time or money to spend a decade+ paying some institution tens of thousands of dollars to hopefully maybe have a real career.

Once software as a career dies, I suspect many will find themselves locked out the middle class for generations.

fcarraldo2mo ago

> Software is or was one of the few remaining arenas wherein a person can find a middle or upper middle class lifestyle consistently.

Software salary inflation and expansion has made this the case. Tech’s accessibility to the educated has accelerated gentrification massively, driving up prices on rent and food. While the statement is correct, tech’s contribution to income inequality is part of the issue. If you’ve lived in Austin or Chicago (especially Austin) prior to ~2010 you’ll have seen this first hand.

keybored2mo ago

> This is a really concerning perspective: people were paid what they were worth.

Even interpreting what-they-were-worth in the usual sense, I’m not so sure about this. We have seen wage collusion reported by the usual US West Coast-based companies. And some news on here[1] have reported that some engineer with a salary of $100K[2] might be producing $1M of value. And even factoring in the usual “but benefits and overhead” comes out to a solid factor of profit per programmer/engineer.

Despite that the sense I get (only from this site since that is my only reference) is that the so-called overpaid engineers are incredibly content to just have this happen to them. As long as they are paid well compared to other workers, it’s fine. No matter the profit factor. In fact, the discourse is very much focused on how “privileged” they were if the tide ever changes. Instead of realizing how much value they provided, collectively.

The outlet for capturing more of the value they create is entrepreneurship (Hello HN). Never any collective organizing. And entrepreneurship is easily bought via acquisition.

Collective bargaining would have been relevant in case they ever get automated... by the very software they co-created.

One could imagine that this “privileged” collection of programmers could have served as a vanguard for the collective good of programming professionals as well as collective ownership of software goods, using their privilege to that end. The former never happened, and the latter is partly realized in people’s free time (see the OSS maintainer in Nebraska meme).[3]

[1] All from recollection since this is just news from the Frontier to me

[2] Of course the pay might be much higher now; this might have been a while ago

[3] when it isn’t simply exploited by corporations just using OSS without giving any back; a logical turn of events when no license or law forces them to contribute back

9rx2mo ago

> This is a really concerning perspective: people were paid what they were worth.

The parent comment doesn't discount that, only pointing out that "what they were worth" was inflated due to a speculative environment. Wherein lies your concern?

infecto2mo ago

Oh come on, there are no “classic HN mistakes” here. Inference is profitable but the bottom line is not yet. This is a very young industry and, unlike those of the past, it’s much easier to picture a possibility of profitability. It’s absolutely different in that the marginal cost scales linearly, but solving for the R&D portion of a product where supply cannot keep up is a lot easier than some SaaS where the underlying product is not being used.

The salary jab was probably a little harsh.

Your ending is a bit of a fizzle too. There are many capex intense businesses that do just fine.

nickphx2mo ago

step change? how? profitable? where did you read that? people want tokens? really? who are these people?

elzbardico2mo ago

Yeah, if we just ignore R&D, fixed costs, depreciation, and the fact that there's a high likelihood investors were expecting a return, yeah, ignoring all of that, and trusting their numbers, we may say inference turns a profit.

In accounting, almost anything you want can be true, at least for some time.

Eridrus2mo ago

The article is just helpfully illustrating how artisanal you can make your slop if you really try!

Aurornis2mo ago· 18 in thread

This article tries to build upon a lot of half-truths or incorrect facts, like this:

> OpenAI is struggling to monetize. They turned to showing ads in ChatGPT,

The ads aren’t going into your paid plans (except maybe a highly discounted tier, depending on the market). The ads are a play to offer a free version. Having an ad-supported free tier isn’t new.

The discussion about being unprofitable also repeats the reductionist view that these companies are losing money and therefore the business model doesn’t work. This happens with every VC cycle where writers don’t understand that funded companies are supposed to lose money while they grow. That’s what the investment money is for.

We have very strong indicators that inference is not a money loser for these companies and is likely very profitable. They should be spending large amounts of money on R&D to get ahead and try new things while they’re serving up tokens.

The “but they’re losing money” argument never seems to be brought out against competitors that literally give away their models for free and for which we can calculate the cost of serving 400B-1T parameter open weight models.

Izkata2mo ago

> The ads aren’t going into your paid plans (except maybe a highly discounted tier, depending on the market). The ads are a play to offer a free version. Having an ad-supported free tier isn’t new.

Sounds like it is new for ChatGPT though. That's also how it started with TV and Youtube, first on the free tier then expanding to the paid ones.

smt882mo ago

YouTube, Spotify, and most video streamers have zero ads on paid tiers. I never see video ads.

carlosjobim2mo ago

YouTube doesn't show ads on the paid plan. If you're talking about sponsored segments those would be impossible to moderate, and YouTube does offer easy skipping of those.

krferriter2mo ago

I've never had ads in my Youtube Premium

butlike2mo ago

This statement doesn't discount the original statement: that ads are going into GPT, which Sam called a last resort.

> The discussion about being unprofitable also repeats the reductionist view that these companies are losing money and therefore the business model doesn’t work. This happens with every VC cycle where writers don’t understand that funded companies are supposed to lose money while they grow. That’s what the investment money is for.

Usually propped-up companies don't last in the long term once the VC subsidy runs out. There's a difference between getting VC money in order to buy rocket parts, and getting VC money in order to charge $7 when you would really need to charge $10. The latter problem never goes away.

throwaway274482mo ago

> We have very strong indicators that inference is not a money loser for these companies and is likely very profitable.

Why is OpenAI specifically losing money hand over fist then?

aurareturn2mo ago

Training. But training costs are a smaller and smaller percentage of revenue as inference revenue grows faster than training costs.
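A rough sketch of that arithmetic, using made-up starting figures and growth rates (none of these numbers come from any company's financials): if revenue grows faster than training spend, training's share of revenue falls every year.

```python
def training_share(revenue, training_cost, revenue_growth, training_growth, years):
    """Return training cost as a fraction of revenue for year 0..years."""
    shares = []
    for _ in range(years + 1):
        shares.append(training_cost / revenue)
        # Compound both series by their own (assumed) annual growth rate.
        revenue *= 1 + revenue_growth
        training_cost *= 1 + training_growth
    return shares


# Hypothetical inputs: revenue triples each year, training spend doubles.
shares = training_share(
    revenue=10.0,
    training_cost=8.0,
    revenue_growth=2.0,
    training_growth=1.0,
    years=3,
)
for year, share in enumerate(shares):
    print(f"year {year}: training is {share:.0%} of revenue")
```

The conclusion flips if the assumed growth rates are swapped, so the argument rests entirely on revenue actually outgrowing training spend.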

gruez2mo ago

>The “but they’re losing money” argument never seems to be brought out against competitors that literally give away their models for free and for which we can calculate the cost of serving 400B-1T parameter open weight models.

To be fair people aren't exactly bullish on the prospects of deepseek or z.ai either, it's just they're below radar so they don't get mentioned.

Kye2mo ago

Z.ai is at least owned by a public company, so there might be something in the financials.

https://en.wikipedia.org/wiki/Z.ai

>> "On 8 January 2026, Z.ai held its initial public offering on the Hong Kong Stock Exchange to become a listed company.[24][25][26] It is considered to be China's first major LLM company that went through an IPO.[26] In February 2026, JPMorgan Chase recommended to investors of purchasing stocks of the company alongside MiniMax.[27]"

https://www.zhipuai.cn/investor_relations/

But I haven't looked into it.

141132mo ago

> companies are supposed to lose money while they grow

At what point do we declare that a company has "grown" and now must make money? OpenAI is a multi-billion dollar company right now, surely that's a point at which they should be profitable, instead of propped up by further investment and borrowing.

> We have very strong indicators that inference is not a money loser for these companies

All of the economic analysis that I've read strongly states the opposite. Running a GPU is a net loss /even for the data centre operators/. For them to break even, they currently charge OpenAI/Anthropic/Etc more than OpenAI/Anthropic/Etc make per-token.

butlike2mo ago

They clearly have some vested interest/skin in the game. Not sure it's worth retorting that one.

monegator2mo ago

> Having an ad-supported free tier isn’t new

having ads shoved in paid tiers isn't new either

raincole2mo ago

And it usually doesn't result in a market crash.

project2501a2mo ago

> This article tries to build upon a lot of half-truths or incorrect facts, like this:

yeah i was wondering why my bullshit detector was going off. This feels as if someone who cooks for Ramsay's kitchen is trying to predict the end of the market hike.

Mentlo2mo ago

We have strong indicators that inference is profitable on non-economically-valuable prompts. We don't have strong indicators that inference is profitable on economically valuable prompts.

As AI companies start extracting rent from the prompting, one of two things are going to collapse - either the long tail revenue base of low-value inference is going to collapse, because people won't be using Chat GPT to get a recipe if it costs them money or if it is ad-ridden; or the cost of economically-valuable inference is going to go up - and whether it goes up to economically stable positions is a toss-up.

And I say this as an AI enthusiast with <50% probability of a bubble burst in the short term.

mcv2mo ago

I've heard "They're losing money" since the 1990s. About Amazon and nearly every other tech company.

The strategy is always:

* Build something useful

* Give it away for free to get people excited

* Convince investors that this is going to rule the world

* Grow to dominate the world

* Enshittify

danaris2mo ago

I don't know about others, but with Amazon specifically, it's always been very clear that their "losing money" in aggregate was purely on paper, for tax purposes: their ability to undercut everyone else was initially based on being online without the brick-and-mortar costs that other stores had, then on economies of scale, and now on being the 900kg juggernaut that just has more money than God and can blow it on running you out of business if they feel like it.

arctic-true2mo ago

The difference is switching costs and the viability of alternatives. Even open source models are only a few months behind the frontier labs, which is a long time in tech but practically no time at all in the eyes of a business consumer. At best, one of the frontier labs will survive and get to flex its hegemonic muscle. But billions and billions of dollars worth of investments still get wiped out in that scenario, which I would still qualify as the bubble popping. This would be doubly true if the winner winds up being Google or Microsoft.

nopinsight2mo ago· 16 in thread

> nobody is sure if even their metered pricing is profitable

This is most likely wrong. Lab executives insist that serving tokens is profitable. It's the cost of training next-gen models that requires them to keep raising ever larger rounds. More importantly, many independent providers price tokens of open-weight models at a fraction of Anthropic's prices.

atwrk2mo ago

But are they actually profitable, or do they employ creative accounting where only parts of overhead expenses are counted against all of inference revenue, similar to what Uber did?

OpenAI's numbers show that they definitely are not profitable on inference, and even worse, revenue growth scaled linearly with inference cost from 2024 to 2025, which means they can't outgrow this problem. See https://www.wheresyoured.at/oai_docs/

therealdrag02mo ago

Does it matter if it’s creative accounting? Uber is a great example of a company that everyone was certain would fail because it was unprofitable and now it succeeded and is profitable.

baq2mo ago

If they shut down all training today they’d be absolutely printing money for the next couple quarters and then die with a bang once the other lab releases the next frontier to the public.

martinald2mo ago

Yes I wrote a detailed article about this Forbes claim. https://martinalderson.com/posts/no-it-doesnt-cost-anthropic...

Key points - if you compare it to openrouter costs for ~similar sized models it is ~90% gross margin.

And this claim came from Cursor - not Anthropic!

shafyy2mo ago

Not counting training models as part of your gross margin is just creative accounting. It's an inherent part of being able to provide the service for OpenAI, Anthropic etc.

Even so, their subscriptions are significantly cheaper than the token pricing via API. So at some point they will need to get rid of subscriptions or increase the subscription prices dramatically... And that's assuming their current token pricing is actually profitable. Which it probably isn't.

Lastly, I would not trust one word that comes out of an executive of an AI company (or any other large company, for that matter).

mrbungie2mo ago

> Lab executives insist that serving tokens is profitable.

Maybe marginally profitable, but right now they need to give out subsidies for people to use their products (Antigravity, Codex, Claude Code et al) in an actually useful manner that prevents churn and at the scale they need to justify usage growth forecasts, which they need to keep the wheel turning.

Probably if you look at the users who exclusively use the simple chat box interfaces (i.e. ChatGPT, Gemini in UI, Claude in UI) plans it is actually profitable, but I'd also say that's not where most of the usage comes from.

I'd love to actually look at both usage + profitability from each user segment to see if their PxQ growth expectations from non-enterprise usage make any sense.

> Many independent providers price tokens of open-weight models at a fraction of Anthropic's prices.

Are those open-weight models as good as Anthropic? Are they the same parameter class?

est312mo ago

It's a loss leader but this is normal. Same has happened with Uber, Airbnb, Amazon, etc. Using VC money to buy marketshare and once you have it, you can milk it.

The question is more around the moats that these companies have and it seems to me while their models are amazing technology, they don't really have a moat. The open/chinese models still continuously catch up to the american ones.

zozbot2342mo ago

> Are those open-weight models as good as Anthropic? Are they the same parameter class?

Are they as good as Anthropic was one year ago? That's more like it. They don't have to be just as good, they just need to be the most worthwhile for the price. If frontier models are only providing a negligible advantage for what they charge, that absolutely matters.

techpression2mo ago

I wouldn't trust those claims from any private companies, even public ones play the most insane tricks in earnings calls to inflate numbers or heck, just make up new ones.

I'm not saying they're wrong, but I don't take much stock in their words.

sunaurus2mo ago

The point is that you can’t just serve tokens without also training the next models. It’s an inseparable part of your costs, so naturally you can’t be profitable unless the price you are charging ALSO covers training.

dash22mo ago

Is that right? I think that you can serve tokens without training the next models. It would be bad strategy, but it would work. So it's an important question, are they covering their operating expenditure? If they are the business has legs (and it will be worth spending a lot to train the next models). If not, maybe not.

gedy2mo ago

Buying and driving a new car off the lot costs the manufacturer nothing at that moment, but what happens before that is important to account for.

phantom7842mo ago

Do tokens just cover ongoing operating costs, or are they also able to pay back the cost of training that model originally?

pier252mo ago

So these companies will be profitable if training stops? Is that even a real possibility?

danaris2mo ago

Any given company could stop training tomorrow, and, as some others have said here, they'd be generating quite a bit of profit until their models visibly fell behind, however long that ended up taking, at which point they'd probably just fall over completely.

Over the whole industry? No; they can never, ever stop training, or they'll cease to be useful at all very soon.

Training is what keeps the models up-to-date on current events, which includes new programming languages, frameworks, and techniques. It's already been observed that using LLM assistance on some types of programming is much more effective than others, based on how well-represented they are in the training data: if everyone stopped training tomorrow, and next month a new programming language came out, none of them would ever be able to help you program in that new language.

This can be extended to other aspects of programming, too. If training stopped, coding assistants would gradually start giving you wrong answers on how to implement code for APIs, frameworks, and languages that continued to evolve, as they will always do, in much subtler (and likely harder-to-debug) ways than how they'd deal with a new language whose existence they don't even know about.

naravara2mo ago

The impetus to continue training at the pace they are is driven by the competition. So if the money starts drying up, then they’ll naturally slow down because they’ll have to figure out how to do more with less.

I suspect that once the models hit a point of “good enough” for certain use cases companies will start putting R&D focus in other areas that may be less expensive. Like figuring out how to run more efficiently, UI/UX conventions that help users get what they’re trying to accomplish in fewer steps, various kinds of caching of requests, etc. So the cost to serve tokens over time should only come down, and will probably start coming down more rapidly as the returns to model training slow down.

That’ll probably be a while though, because each successive model tends to be a lot better than the last.

piker2mo ago· 11 in thread

> They lose a big customer for their cloud services. Even worse considering that now, using the AI they helped fund, everyone can compete with their sub-par products. GitHub is a good candidate for disruption, and that’d be just the start.

Look, I'm a Microsoft hater like the rest of us, but calling Microsoft's products sub-par discredits the author a good bit. I invite anyone who thinks this to try and compete with them. Go after something like Word, for example. Then prepare to be awed by what some of the most brilliant programming minds ever can produce after grinding for four decades.

hbn2mo ago

If I saw a helicopter crashed into a tree, I don't have to be a helicopter pilot to know it's not an ideal state of a helicopter and something/some people failed.

When I'm using MS Word and it takes 20 seconds to cold launch on a machine that's orders of magnitude faster than any computer 25 years ago, where it launched near instantly, I can tell something is going wrong. When all of their software is harassing me to use AI in ways I don't want to use it, I can tell something is going wrong.

s1artibartfast2mo ago

your comment sums up the conflict.

I don't know if you noticed, but there was a shifting of the goalposts from "sub-par" to something wrong/sub-optimal.

The best helicopter you can buy may in fact crash into trees sometimes.

Aperocky2mo ago

Sub par is not the right word, the right word is feature creep.

Markdown has much less of that brilliance, and thankfully I needed none of it.

Last time I authored a word document is probably 2 years ago for a government interaction.

karolist2mo ago

You can have an opinion about a tool as a user without ever having the ability to create such a tool yourself; that's literally what every tech and auto reviewer does.

piker2mo ago

Sure, and the less you understand about the tool’s fundamental capabilities, the less useful your opinion is. The best reviewers have deep knowledge.

curtisblaine2mo ago

I'm sure Word is full of arcane backwards-compatible tricks that 20% of users use, but I find it hard to differentiate the Pareto 80% of the product from Google Docs or any other competitor (LibreOffice?). Adding rich text, tables, headings and colors is pretty much a solved problem for all of these programs. Adding images or handling more complex layouts sucks everywhere; it's not like Word has a great user experience and the others don't. All of them are bad. IMHO, if any of the competitors were the de facto standard for word processing, the vast majority of users wouldn't feel the difference. Power users would for sure, but I'm not sure there are many of them or that the features they use are essential. If Word didn't have a near monopoly in office settings due to aggressive marketing, OS presence and a proprietary file format that constantly changes and never renders well outside of Microsoft products, it could disappear without anyone (save Microsoft) losing much.

piker2mo ago

Yes. That 80% you find useful is served fine by Google Docs, but there’s a good reason the enterprise overwhelmingly goes for Word, and it lives deep in that 20% and a lot of the time has zero overlap with others.

red_admiral2mo ago

Microsoft's AI, on the other hand, is underwhelming at the moment and might well go the way of Windows Phone. Plus enough people hate the copilot icons everywhere that Microsoft is hinting at dialing down a bit.

MS Office should last a while if they stop calling it "Copilot 365 Office" or whatever it was.

tapoxi2mo ago

The state of GitHub and Windows 11 certainly qualify as sub-par.

sooperserieous2mo ago

I think Github represents 'par'. Plenty of stuff worse and plenty of stuff better. Overall it's what most people expect a coding social media site to be because it set those expectations. Those of us who are only looking for code management (including issues/PRs/etc) are easily satisfied elsewhere.

piker2mo ago

There are some frustrating parts, but subpar is an odd way to describe GitHub to me. I’m pretty happy with what they’re doing, and find the UX super helpful. I do agree Actions needs a debug mode but otherwise I get a ton of value out of the service for $20/month?

schnitzelstoat2mo ago· 8 in thread

It's a winner-takes-all market and everyone wants to be the next Google and not the next Lycos or AskJeeves etc.

It'd be interesting to see what they spend all the money on though as we seem to be hitting diminishing returns and I'm not sure if the typical enterprise user really cares about small improvements on benchmarks.

It seems like it'd probably be better to spend all that on marketing, free trials, exclusivity/bundle deals etc. ChatGPT already has a strong advantage there as it has so much brand recognition. I've seen lay people refer to all LLMs as ChatGPT like my grandparents did with Nintendo and all video game consoles.

joefourier2mo ago

It’s absolutely not winner take all. LLMs have become a commodity and the cost of switching models is essentially nil.

Even if ChatGPT has brand recognition amongst lay people, your grandparents aren’t the ones shelling out $200/mo for a Claude Code subscription and paying for extra Opus tokens on top of that. Anthropic’s revenue is now neck and neck with OpenAI, but if tomorrow they increased the price of Opus by 5x without increasing its capabilities, many would switch to Gemini, GPT 5.4, Cursor, or any cheap Chinese model. In fact I know many engineers that have multiple subscriptions active and switch when they hit the rate limits of one, precisely because the tools are so interchangeable.

At some point it could even become cheaper to just buy 8x H100s and host Qwen/Deepseek/Kimi/etc yourself if you’re one of those companies paying $3k/mo per engineer in tokens.
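A back-of-the-envelope sketch of that break-even. Only the $3k/mo figure comes from the comment above; the hardware price, running costs, and team size are hypothetical placeholders, and the sketch ignores any quality gap between self-hosted and frontier models.

```python
def breakeven_months(hardware_cost, monthly_opex, engineers, token_spend_per_engineer):
    """Months until owning the hardware is cheaper than buying tokens.

    Returns None when token spend never exceeds the running costs.
    """
    monthly_saving = engineers * token_spend_per_engineer - monthly_opex
    if monthly_saving <= 0:
        return None
    return hardware_cost / monthly_saving


months = breakeven_months(
    hardware_cost=300_000,           # assumed price of an 8x H100 server
    monthly_opex=5_000,              # assumed power, hosting, maintenance
    engineers=10,                    # assumed team size
    token_spend_per_engineer=3_000,  # figure quoted in the comment
)
print(months)
```

With these placeholder inputs the hardware pays for itself in about a year; a single engineer at the same spend would never break even, because the assumed running costs alone exceed the token bill.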

mattmanser2mo ago

I have non-tech friends telling me about preferring other models like gemini, this feels like the early days of search engines when people were willing to switch to find better results.

baq2mo ago

> It's a winner-takes-all market and everyone wants to be the next Google

absolutely isn't! if billed per token, there is no reason to be married to a single model family provider at all. the models have very different strengths and weaknesses, you should be taking advantage of this at all times.

wavemode2mo ago

people used to say this about search engines and web browsers, as well

regardless, eventually Google became the universal default for both. When it comes to software, the average person doesn't shop around for the technologically optimal choice, they just use what everyone else is using.

H8crilA2mo ago

Where to go next? I don't think anyone has gotten close to automating everyday PC usage, likely via screen capture and raw keyboard+mouse inputs. Imagine how much bigger that market would be than vibecoding.

wavemode2mo ago

tbh I don't think this use case is going to be as big as people seem to think

there are a lot of reasons, but in brief - I think AI desktop use is a product that the average person isn't going to get much value out of. to make an analogy - the creators of Segway thought people would buy them in large numbers, but it turned out most people don't mind walking manually (or at least, don't mind it enough to spend money on a scooter). I think makers of AI Desktop Use products are going to find out the same thing as it relates to everyday tasks like checking email and shopping.

zozbot2342mo ago

Automating GUI use is a silly idea when the AI can do much of the same things by getting access to a *nix command line - which is how all coding models work. It matters when driving proprietary apps or browsing websites that aren't providing a clean machine-readable API, not really otherwise.

delecti2mo ago

I don't think it's winner-takes-all. Google is Google in 2026 because Lycos and AskJeeves were bad in comparison. The average user doesn't care whose LLM they're using because they're all close enough. It's hard to see past the bubble bursting, but I expect most people will use multiple of them depending on context (Copilot via the integration in windows, Gemini via Siri on their phone, etc), likely without paying.

logravia2mo ago· 6 in thread

The thing I am struggling with is where is the impact of LLM tools, especially given the massive increase in token consumption from 2025 to now and the saturated presence of LLMs everywhere.

Naively speaking, I have so many expectations for the impact of this tech.

I'd expect a noticeable uptick in applications published on Google, Apple and Microsoft app stores. I'd also expect an uptick of games published to Steam. I'd expect an uptick in Github repos and libraries on PyPi.

I'd also expect some impact on the GDP ⸻ a non-negligible part of running a business is communication, planning, ads. Naively, I'd expect that LLMs should be able to both speed some of these things up and lubricate others.

I'd also expect that large corpos like Microsoft and Apple would have more resources to spare on the essential details of their OS like having a functioning taskbar or a predictable, consistent GUI.

I'd expect increased SAT scores or improved PISA results. Maybe even improved mental health, let's go wild.

It strikes me as a reasonably useful tool, personally.

Yet, where are the goods in the aggregate?

atomicnumber32mo ago

Programming is a necessary but not sufficient condition for software products to exist. So while the programming has to be good, so too do many other things, like product vision, product management, project management, and of course there still needs to be feedback between all of the above so that engineering isn't implementing a misunderstood version of the product and that product isn't asking for 5 years and a PhD research team. And on and on and on. Typing the code is like 2-10% of actually ending up with a software project and it's more toward the 2% for a software business.

So while AI made coding maybe 110% faster, it has also made literally every other person in the process lose their gd minds and they're wanting to break or skip everything else in the process to just shit out code faster.

d2ssa2mo ago

Going faster only works WHEN you know EXACTLY (or close to it) what you want.

Going faster when experimenting? Nah you actually need a mix of slow and fast, and mostly slow stuff up-front.

There's a fundamental misunderstanding of how people actually do stuff imo - it's akin to force-fitting a square peg in a round hole. I'm sure many are hoping it's just a 'your organisation is designed wrong' problem. I doubt it though.

atomicnumber32mo ago

I meant 10% faster btw, typo

therealdrag02mo ago

The tech is still young and projects take time. And there are many slow parts of building that have not been accelerated (mythical man month).

I have started making an indie game, as one does, and it’s easily going 2-4x speed, but even still I’d predict a year of free time development with focus to ‘finish’ this thing. But the latest agentic tech is 3 months old.

tasuki2mo ago

> ⸻

Wow, I'm impressed at your usage of this. Apparently it's 0x2E3B, named "three-em dash".

You must be human!

logravia2mo ago

Oh yeah, a month ago I was reading a comment section about LLM writing tendencies and someone humorously suggested using the loooooooong-em-dash to distinguish yourself from LLMs. I found it so charming that I made my keyboard output it when I double tap "-".

On Linux you press Ctrl+Shift+U and then type 2E3B, then press enter.
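If you'd rather check the code point than trust the thread, a small Python sketch using only the standard library looks the character up by its number and prints its Unicode name:

```python
import unicodedata

# U+2E3B is the code point mentioned above.
dash = chr(0x2E3B)

print(dash)                    # the character itself
print(f"U+{ord(dash):04X}")    # code point in hex
print(unicodedata.name(dash))  # official Unicode character name
```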

Chance-Device2mo ago· 6 in thread

From the beginning of this I’ve wondered the same question: how do these companies justify spending such massive amounts now (and 3 or 4 years ago) when software and hardware efficiencies will bring down the cost dramatically fairly soon?

They basically decided that scaling at any cost was the way to go. This only works as a strategy if efficiency can’t work, not if you simply haven’t tried. Otherwise, a few breakthroughs and order of magnitude improvements and people are running equivalent models on their desktops, then their laptops, then their phones.

Arguably the costs involved means that our existing hardware and software is simply non viable for what they were and are trying to do, and a few iterations later the money will simply have been wasted. If you consider funnelling everything to nvidia shareholders wasting it, which I do.

Aperocky2mo ago

The decision is the right one. Scaling at any cost is the right way to go.

You cannot find the efficiency if you haven't been experimenting at scale, this is true personally as well.

If someone hasn't been burning a few B tokens per month, everything coming out of their mouth about AI is largely theory. It could be right or wrong, but they don't have the practice to validate what they're talking about.

Not everyone scaling to that degree would have the right answer or outcome, many would be wrong and go bust. But everyone who didn't will not have the right answer.

raincole2mo ago

Well said. Quantity itself is a quality.

In the worst of the worst case, they're building know-how of how to manage big datacenters, infra and data-labeling teams. These are incredibly valuable in the next few years. And no, no one, not even the AI companies' executives themselves, believes that you can delegate business know-how to LLMs.

ap992mo ago

They're not just betting on the current tech, they're building out infra like this because probably any future tech currently being researched will also require massive data centers.

Like how the gpt llms were kind of a side project at openai until someone showed how powerful they could be if you threw a lot more parameters at it.

There could be some other architecture in the works that makes gpts look old - first to build and train that new ai will be the winner.

phito2mo ago

I think their current goal is to capture as much market as they can while they still have the best models, their only moat. Look at Anthropic, they are clearly trying to lock their users in their ecosystem by refusing to follow conventions (AGENTS.md etc) and restricting their tools exclusively to their own services.

mrob2mo ago

Because whoever wins the AI race (assuming they don't overshoot and trigger the hard takeoff scenario) becomes a living god. Everybody else becomes their slave, to be killed or exploited as they please. It's a risky gamble, but in the eyes of the participants the upside justifies it. If they don't go all in they're still exposed to all the downside risk but have no chance of winning.

I don't expect hardware prices to go down unless the third option (economic collapse) happens before somebody triggers the dystopia/extinction option.

WarmWash2mo ago

Just to add some slight nuance, but it's an important distinction:

They aren't all necessarily racing to be "god", some are racing to make sure someone else is not "god".

If it weren't for Altman releasing ChatGPT, it's very likely that we would have markedly less powerful LLMs at our disposal right now. Deepmind and Anthropic were taking incredibly safe and conservative approaches towards transformers, but OAI broke the silent truce and forced a race.

aurareturn2mo ago· 6 in thread

This is an awful article. I don't know how it reached #1 on HN.

Bottom line is that H100 prices are near 3 year highs, A100s are still profitable to run, B200 prices are increasing, no one has enough compute. Google, OpenAI, Anthropic, Meta, AWS, Azure are all compute constrained. Every single one of them said so publicly. Neo clouds are telling customers they're all sold out now and you even have to book compute in advance if you're an AI company.

> OpenAI is struggling to monetize. They turned to showing ads in ChatGPT, something Sam Altman once called a “last resort”, while Anthropic is crushing them with the more profitable corporate customers and software engineers.

AI bubble is bursting because OpenAI is trying to monetize free users on ChatGPT with ads but Anthropic is kicking butt in AI. What kind of logic is that? So it seems like AI can be monetized as Anthropic shows. Is AI going to burst because OpenAI can't monetize but Anthropic can?

> I wouldn’t be surprised at all if in the next couple of quarters we see OpenAI looking for an exit. It will be interesting because the sizes are now so big that we will probably know all the details. The most likely buyer is Microsoft, they already own a lot of it, and because of that, they are the most interested in showing a win.

I'll take the opposite stance. I think OpenAI is going to be bigger than Microsoft in market cap within the next 3 years. I think Anthropic and OpenAI are going to run laps around current big tech except maybe Google. For example, in a few years, I think AI agents could completely replace Microsoft Office, Microsoft's cash cow.

> Independent reports state that Claude metered models are priced 5x more expensive than their subscribers pay

Already dispelled. It isn't 5x more expensive than their subscribers pay. Inference has a gross margin of 50%+. It's been repeated over and over again by Anthropic CEO, OpenAI CEO, and just about anyone who's done deep analysis on token profitability. If you don't believe OpenAI and Anthropic CEOs, just look at inference providers on Openrouter. They don't have VCs backing them selling tokens at a loss. They should be making margins on every token in order to keep the lights on.

doom22mo ago

> Bottom line is that H100 prices are near 3 year highs, A100s are still profitable to run, B200 prices are increasing, no one has enough compute.

Then why aren't the hardware manufacturers of components needed by AI companies making plans yesterday to bring new fabs online to meet demand? That isn't a gotcha question, I genuinely want to know. The money involved isn't that much compared to the money changing hands between Nvidia, Microsoft, OpenAI, etc., and it's not like once in-progress data center construction is complete they won't need to buy more RAM and GPUs, especially with any new advances in technology that might happen.

Inevitably someone will reply that hardware manufacturers don't want to be stuck losing money on a facility because the bubble popped and demand disappeared, but if Anthropic and OpenAI are going to "run laps around current big tech", it should be a no-brainer to increase production capacity.

jsnell2mo ago

A new fab will need to be filled with advanced equipment like lithography machines. They are the most complex things humanity has ever built.

There is one supplier of EUV lithography machines in the world, ASML. They are basically acting as an integrator for hundreds of highly specialized components manufactured to unimaginable levels of precision. Each of those components has roughly one eligible supplier in the world, which is operating at full capacity. To expand, they'll need yet another set of specialized and almost impossible to build equipment.

So the supply chain moves incredibly slowly, and the slowness is intrinsic due to the complexity and depth of the supply chain. It can't be fixed with just money. IIRC ASML is aiming to merely double their production of EUV lithography machines by 2030.

1 more reply

HackerThemAll2mo ago

> I think OpenAI is going to be bigger than Microsoft in market cap within the next 3 years.

I am yet to see how a one-legged business model with just a single product (that is not crude oil), without a plan and money is going to become sustainable. Oh yeah, maybe they'll finally make money on those autonomous lethal weapons. That sounds the easiest.

1 more reply

nunez2mo ago

> I think AI agents could completely replace Microsoft Office

How? What do you think lawyers/government will use to write briefs?

1 more reply

the_gipsy2mo ago

> but Anthropic is kicking butt in AI

that's not what the article said:

> They turned to showing ads in ChatGPT, something Sam Altman once called a “last resort”, while Anthropic is crushing them

1 more reply

veunes2mo ago

OpenAI overtaking Microsoft? Seriously? Microsoft has a massively diversified business spanning from gaming and cloud infra to B2B software that the entire world runs on. OpenAI has exactly one product (matrix weights), which is getting heavily commoditized by open-source models every single day. Once a theoretical Llama 4 catches up to GPT-5, an API price war is going to completely nuke their hyper-margins

1 more reply

256BitChris2mo ago· 5 in thread

I could see OpenAI hitting financial issues which triggers some media induced panic and for people to claim the AI bubble has popped.

However, the core utility of the best AI (read: Anthropic's ATM, by miles), will still exist and be leveraged by those who have learned to use it well.

I could also see the exponentially declining power requirements offsetting the exponential-but-slower rate of AI compute demand, which then renders a lot of unused capacity in these massive data centers.

I think of it like the old mainframes in the 70s which would take an entire city block to run, and now we have the equivalent of millions, if not billions of them in our pockets.

baq2mo ago

Anthropic isn’t the best by any reasonable measure. They’re the best in some areas and get pwned in others.

In general AI is very much like human intelligence in the regard that no two models are the same just like no two people are the same. IOW if you are a single model shop you might even not have any idea that you’re falling behind.

jqpabc1232mo ago

> I think of it like the old mainframes in the 70s

I think this is a good comparison to current AI.

> billions of them in our pockets.

AI in your pocket (but first on the desktop) is a real possibility.

cmrdporcupine2mo ago

The coming months are the reckoning in which the poor quality of the tooling and the safeguards around them become evident and hopefully eventually rectified.

By which I mean the competent organizations are the ones that will come up with cultural and technical solutions to manage the quantity and quality of the code better.

Others will suffer severe quality issues. Not because the "AI"s produce inherently inferior code but because the volume of the code is too high to manage review of, and to have good internal organizational knowledge of to manage the pages in the middle of the night when servers go down because of code nobody really understood.

I produce masses of independent project work all day long in my spare time using these tools and they blow me away. But in the context of professional work on teams of other coworkers the results are difficult to reason about and often impossible to competently review, and it's not clear the results are superior. IMHO companies that drink too deep from the well without caution could be burned badly.

Aside:

I hate to say it, but there is no sense in which Anthropic has the clearly better product than OpenAI at this point. I know Claude caught developers' hearts through the fall, but GPT5.4 is a more powerful, careful, and competent model for coding and Codex is a far less buggy and more performant TUI. For the last 3 months I've gone back and forth between the two, and I always run anything written with Claude Opus 4.6 by myself and my coworkers through Codex for review. It is constantly finding severe correctness issues, to the point where I simply won't subscribe to Anthropic's product anymore.

On top of that, OpenAI provides far higher token limits. Even their $20 plan goes quite far.

If I was just building crud websites, probably Claude Code would be fine, and it does indeed show more "initiative" and "imagination" but I've seen it build way too many race conditions and correctness issues to trust it or the work my coworkers make with it.

_puk2mo ago

A lot of Anthropic's recent improvements are coming from the task focus and improved orchestration around the models, not purely massive changes in the models themselves.

This bodes well for us being at a point that even if the bubble burst, we'd still have usable AI going forward.

eieje2mo ago

It’s pretty much undeniable at this point that the sentiment has changed.

About 2 months ago this place was unbearable - filled with doom and hype AI posts. I welcome the calming and eventual slow release of the bubble.

qoez2mo ago· 3 in thread

History doesn't have to repeat. There's barely anything else going on in terms of innovation, and AI is a real step function technology. We might be overspending but there's no way we're getting another AI winter like last time (remember last time investment in 90s AI had to compete for resources with the internet boom).

hk__22mo ago

Isn’t that covered at the top of the post?

> AI is here to stay. If used right, chances are it will make us all more productive. That, on the other hand, does not mean it will be a good investment.

joefourier2mo ago

The dotcom bubble burst and 26 years later we’re all hopelessly addicted to the internet and the top companies on the stock market are almost all what would have been called “dotcoms” then.

The railroad bubble burst in 1846 not because trains were a dead end - passenger numbers would increase more than 10x in the UK in the following 50 years.

lionkor2mo ago

> History doesn't have to repeat

This is high up there on the list of things people say before, you know, it does

jqpabc1232mo ago· 3 in thread

Another possibility not really addressed here --- local LLMs.

AI on hardware you own and control --- instead of a metered service provider. In other words, a repeat of the "personal computing" revolution but this time focused on AI.

TurboQuant could be a key step in this direction.

schnitzelstoat2mo ago

Yeah, I don't think local LLMs will keep up with what the massive corporations put out. But they might get to a level of performance where it just doesn't matter for most users.

And people would prefer to run a model locally for 'free' (not counting the energy cost) rather than paying for an LLM subscription.

zozbot2342mo ago

TurboQuant helps KV quantization which is not very relevant to local LLMs, since context size becomes most relevant when you run inference with large batches. For small-scale inference, weights dominate. (Even if you stream weights from SSD, you'll want to cache a sizeable fraction to get workable throughput, and that dominates your memory usage.)
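The weights-versus-KV-cache point can be checked with back-of-the-envelope arithmetic. Below is a minimal sketch, assuming a plain transformer with grouped-query attention and no sliding window; the 32B-parameter model and its layer and head counts are hypothetical numbers picked for illustration, not any real model's configuration.

```python
def weight_bytes(n_params, bits_per_weight):
    """Memory needed for the model weights, in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, batch_size, bits_per_value):
    """Memory needed for the KV cache, in bytes.

    The factor 2 covers one key vector plus one value vector
    per token, per layer, per KV head.
    """
    values = 2 * n_layers * n_kv_heads * head_dim * context_len * batch_size
    return values * bits_per_value / 8


# Hypothetical dense 32B-parameter model (illustrative numbers only).
MODEL = dict(n_layers=64, n_kv_heads=8, head_dim=128)

weights = weight_bytes(32e9, bits_per_weight=4)  # 4-bit quantized weights
local = kv_cache_bytes(**MODEL, context_len=8192, batch_size=1, bits_per_value=16)
served = kv_cache_bytes(**MODEL, context_len=8192, batch_size=64, bits_per_value=16)

GIB = 2**30
print(f"weights (4-bit):       {weights / GIB:.1f} GiB")
print(f"KV cache, batch of 1:  {local / GIB:.1f} GiB")
print(f"KV cache, batch of 64: {served / GIB:.1f} GiB")
```

Under these assumed numbers the cache is about 2 GiB for a single 8K-token sequence against roughly 15 GiB of weights, but about 128 GiB at a batch of 64. That is why shrinking the KV cache matters far more to a provider batching many requests than to someone running one conversation at home.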

netdevphoenix2mo ago

Local LLMs don't sound profitable at all for those building them. If you really wanted a SOTA model, you would be paying eye watering amounts to own it unless you got an open sourced one.

1 more reply

ajay-b2mo ago· 2 in thread

I would be very sad to lose services like ChatGPT. It has significantly improved my workflow by digesting and analyzing huge documents, and helping me to synthesize and respond better. Maybe I am part of a minority.

coder682mo ago

The good news is local models have significantly improved. If it all goes down today, you can still run e.g. Qwen 3.5 at home, and it's "good enough" for most workloads.

With a gaming GPU you can run Qwen3.5-35B-A3B. I use 122B-A10B on my local rig (1x6000 Pro), and 397B-A17B on my 2x6000 Pro server (some spillover into CPU/RAM). It's pricey now but probably within a few years it'll become very affordable.

raincole2mo ago

Don't worry lol. It's not going anywhere. The article is just ragebaiting. Verbatim:

> Anthropic is already in a push to reduce costs and increase revenue

Yeah, it's totally a bad sign when a company tries to... reduce costs and increase revenue.

1 more reply

NickNaraghi2mo ago· 2 in thread

> Taking this into account, Google is extremely well positioned to weather the storm. When they announce capex expenditure, they don’t spend it overnight. They can simply deploy month by month until their competitors struggle to raise and get forced to capitulate. At that point they can just ramp down the spending and declare victory in a cornered market. They don’t need capex, they just need to make it very clear for everyone that nobody can outspend them.

Have you tried Gemini 3.1 lately? It is not even close to Opus 4.6 never mind Claude 5.

This post, like many pessimistic takes, seriously discounts innovation and the exponential takeoff of recursive self-improvement.

endymion-light2mo ago

Exponential take-off is great until it stops- genuinely, what are the signals showing any of the large models are performing exponential takeoff and recursive self-improvement?

Currently a lot of that appears to be marketing hype to drive up usage. Is it exponential, or are the labs spending exponentially more for smaller and smaller gains from LLMs?

bogzz2mo ago

What recursive self-improvement?

1 more reply

franze2mo ago· 2 in thread

.... so what? the technology exists, the models exist. Even when the bubble bursts things will not go to the state "before AI". Even if model development would stop today (not the worst thing to happen) it would still be the most impactful invention since the printing press

hk__22mo ago

Yes, that’s what the author wrote in the second sentence of the post: "AI is here to stay."

irusensei2mo ago

I guess the point is that without the hype subsidizing it, enshittification will ensue.

general_reveal2mo ago· 2 in thread

HN is no longer a reliable place for the truth. Quite frankly, unless you are utterly self educated, you are terribly vulnerable to this place.

At this rate, I’d almost prefer to talk on a private mailing list with vetted resumes.

rvz2mo ago

> HN is no longer a reliable place for the truth.

"No longer?" It never was.

Especially with AI boosters being allowed to degrade the comments section and shilling their paid blogs and violating the HN guidelines.

1 more reply

myspy2mo ago

Why?

1 more reply

hyperpape2mo ago· 1 in thread

> Magnificent 7 companies are increasing capex to their biggest ever to differentiate their tech from each other and the big AI labs, but the key realization is that they don’t have to spend it to win. It’s a defensive move for them, if they commit $50B, OpenAI and Anthropic need to go raise $100B each to stay competitive, which makes them reliant on investors’ money.

Stay competitive how? If the Magnificent 7 aren't spending the money, then how could it possibly hurt OpenAI/Anthropic to not raise equal amounts of money? Maybe you can pull together an explanation, but this author didn't even try to do so.

This piece seems poorly thought-out, but well designed to get shared.

Promote writers who will actually explain their claims carefully.

martinvolOP2mo ago

they have to fight to stay competitive because mag7 can outspend them, but my hypothesis is that they won't need to ultimately.

ethagnawl2mo ago· 1 in thread

> If investor money dries up, they will be forced to cut their losses and pass the true costs to their users.

I do not see this talked about often enough whilst everyone is in the process of introducing hard dependencies on these services into their workflows.

senordevnyc2mo ago

Really? Virtually every AI thread on HN has multiple people promising doom and gloom once the labs start passing the "true costs" onto the users. This very post has multiple deep comment chains arguing about this!

KaiserPro2mo ago· 1 in thread

The problem with this kind of post is that "How" is almost useless. I can tell you how the bubble pops: the value of these AI companies crashes and takes a lot of other stuff out with it.

The interesting questions are: "What triggers it" and "what also goes tits up"?

The issue with high/international finance is that a good percentage of it (if not more) is fraudulent or semi fraudulent bollocks.

"Here is a startup that is worth x million because y." Both of those statements are bollocks. However, it's in the interest of most people to agree with that bollocks to get money. If enough money is given there is a chance that the startup will make money.

If we look a few years back, NFTs fulfil that niche quite nicely. It was obviously bollocks, but a very convenient way to launder money, or run a series of rugpull operations.

The problem we have to contend with now is that the sheer amount of money that has been invested, all disappearing at once, would require 2007/8 levels of coordination to unfuck. The US government does not have the requisite number of admins to pull that off again, and no political will to ever have that expertise again. So if AI does go pop, and it takes a lot of money with it, I would guess that China does the money lubrication and extracts a subtle but richly ironic level of control in exchange.

Also, it's no guarantee that AI will trigger the next bubble popping; my money is on Private Equity.

martinvolOP2mo ago

> The problem with this kind of post is that "How" is almost useless. I can tell you how the bubble pops: the value of these AI companies crashes and takes a lot of other stuff out with it.

That's like saying "I know exactly how you're going to die, your heart will stop"

shubhamjain2mo ago· 1 in thread

> OpenAI is struggling to monetize. They turned to showing ads in ChatGPT, something Sam Altman once called a “last resort”, while Anthropic is crushing them with the more profitable corporate customers and software engineers. Their shopping feature flopped and they shut down Sora, both supposed to be revenue drivers.

I don't think Sora was ever thought of as a "revenue driver" considering how notoriously expensive and unpredictable video generation via inference is. OpenAI is just a repeat of Uber—minus the scandals—in a different decade. Uber got itself into tons of businesses related to transportation on the assumption that it would all be viable "one day." Same stuff that OpenAI is doing.

I would say, once the bubble bursts—which is likely, considering the geopolitical environment—OpenAI, Anthropic, and Alphabet are likely to be the winners, with a lot of small players at the tail end. Anthropic won over programmers and OpenAI on everyone else. For millions of people, AI = ChatGPT, so I would bet that OpenAI can still become profitable, once they cut down their expenses.

JohnTHaller2mo ago

> minus the scandals

Given the tech bros involved, we just don't know about them yet. Also was this comment generated using AI? Look at all the em dashes.

hnthrow02873452mo ago· 1 in thread

I don't see this bubble really popping as-in sinking the economy. Some circular investing and enough write offs will happen to avoid the largest recession indicators from informing the general population that there's actually a recession. You also have a government willing to do shady shit for their own benefit at the expense of responsible governing and ethics, and we have already seen the business leaders of the biggest tech companies cozy up to the administration.

My guess is that cloud companies will scoop up the data centers for pennies on the dollar and the GPUs get written off or fire-sold to enthusiasts still wanting to run local models. Then they can offer exceptionally low initial prices to new customers and get more people to be locked in. Or maybe we see a couple of new cloud companies start up but that would likely need lower interest rates.

lstodd2mo ago

DC infra will be scooped up by cloud guys, that's a given. As for GPUs.. well low-precision tflops have other uses besides inference. You can run Doom for example.

HardCodedBias2mo ago· 1 in thread

In general:

Cynicism makes you sound smart. Optimism makes you successful.

The cynicism around this technology is everywhere, even though it clearly has real power to solve problems. It is a technology which enables so many use cases that were impossible before, that makes it very highly hyped/expected. And that is causing an immune (over) reaction by natural skeptics, that's an error.

People need to take a measured, reality based, view of how the technology is being used today, the adoption curve, and the increase in capabilities over time.

It's clearly being used strongly, and may even be revolutionary.

Bubbles burst when there's no 'there' there. AI has an undeniable 'there'—the only question is the timing of the ROI.

martinvolOP2mo ago

bubbles are created by people investing more than reasonable in something, independent of the actual value it will generate for society.

lnfromx2mo ago· 1 in thread

Okay, let's suppose all those companies would be profitable if training stopped today. What if token demand is shrinking? I think big parts of the current demand are artificially built by e.g. FOMO and marketing, without real value generated by them. There is no indication in economic data of a productivity boom resulting from AI usage. The next thing is energy costs, which will soon eat into profitability too. I don't see how this bubble can't burst.

martinvolOP2mo ago

I don't think token demand will shrink. Because we're still just learning how to use it, demand will skyrocket. The problem is what price we'll be willing to pay for it, especially if competition keeps soaring.

2 more replies

elorant2mo ago· 1 in thread

I feel that even if the bubble bursts hardware prices will still take years to normalize. So no clear benefit for the average consumer here.

baggachipz2mo ago

Consumers and retail investors will bear most of the brunt from this bubble. Even taxpayers, as the government will most likely bail out the "too big to fail" ai companies in the "race against China". All based on bullshit, hype, and greed.

monegator2mo ago· 1 in thread

> How this affects you?

> checks list ...

nope, nothing will either directly or indirectly affect me. Let it happen sooner, rather than later, and unleash the mobs at the tech bros that set the world on course to make everybody's life more miserable. We'll still be here to get the scrapped RAM and GPUs to train and run inference on local models, thank you very much.

coffeebeqn2mo ago

The current best models are already very capable of disrupting the job of millions of people. I don’t think a scenario where we just go back to pre-Claude Code exists and I’m sure the same models can be tuned for much of other white collar work at similar capability

2 more replies

richard___2mo ago· 1 in thread

Complete bs.

martinvolOP2mo ago

great feedback

1 more reply

tracker12mo ago

Datacenters themselves are really weird... most of the announced 2024 data centers are nowhere near completion, most of NVidia's production is taking longer to deploy than to produce and will be upwards of 2+ years behind on deployments sometime in the next year.

That doesn't even begin to cover the lack of actual electricity to power the data centers. We have more "dark silicon" sitting in boxes that aren't close to being deployed, while a lot of actual people can't manage to buy consumer products for anything resembling reasonable... it's kind of insane to say the least.

beepbooptheory2mo ago

Just checked and my API bill for this stuff is about $2.50 this month. Am I really the minority here? I know there are a lot of kids into openclaw and paying for subscriptions and stuff, but after that literally no one I know (who isn't a developer) is paying for it, and seemingly would never dream of paying for it. It would be like paying for Gmail to them I think.

I just don't understand why it justifies so much spending!

agentultra2mo ago

It sounds like most of the data centers promised in 2025 and 2026 are not even built yet and most of the GPUs bought haven't even been installed.

If it does all go down in flames, even floor value is not going to be that valuable.

I can't predict the future but it's smelling a lot like a recession already under way that is bigger than the sub-prime crash.

mvdtnz2mo ago

> And independent of whether Microsoft makes money or not in their OpenAI endeavor, it kills the story: they were betting the whole growth story on AI, and if that doesn’t work out, then what’s left to justify a high stock price?

Microsoft's stock price today is the same as it was in late 2021 before anyone cared about AI. What would happen? Nothing. I don't think it's a significant revenue driver today. Microsoft, like everyone else, is speculating that AI will drive profits in the future. If it all fell apart there will certainly be losers but I don't see why it would bring down Microsoft.

skeeter20202mo ago

>> Building a datacenter is supposed to be a “safe” investment in normal times, so banks give private credit and mortgages to finance them.

Except the investment is more like a railway or utility. It generates like 3% return, which is definitely not good enough for the people providing the money, or (in the case of the profitable companies) anywhere near the double-digit returns they make on their technology products. I won't be surprised when we see consolidation of marginal players and abandonment of the losers, just like you can find rail lines to nowhere, and fiber that's never been used.

Lerc2mo ago

A lot of this makes me imagine an aeroplane flown by a mad pilot, overloaded and running out of fuel. The passengers are all blaming the guy sitting in the back knitting a parachute and telling him that the chute will never work because the wool is the wrong colour.

The tragedy is when it's all over one of the surviving passengers will go "See! I knew we were going to crash because of that knitter"

HackerThemAll2mo ago

Excellent reading to realize how the rich greedy investment monkeys with no plan other than "let's build a data center" will ultimately drag the market and the economy down. This time it may not explode as abruptly as in dotcom era, but will slowly sink as the stupid US data center boom proves unprofitable. Billions burned for nothing more than a run for the money.

EternalFury2mo ago

If somehow recovering the capex expenditure is not counted, if somehow the cost of developing future models is not counted, then yes, inference costs of current leading models allow a profit.

But those things are tied together.

Even xAI, that now has a reasonably competitive model, is struggling to achieve PMF. Meta is in shambles because their models have underperformed for years now.

titzer2mo ago

The article says "...and RAM prices are crashing because new models won’t need as much," and I went and read the link. The link was a puff piece for a very specific compression mechanism that...no one is using?

I do hope that RAM prices come down but this was just wishful thinking.

positron262mo ago

When will this concern farm end? Internet is ant-milling harder than a model gone psychotic on synthetic data. Call me when it's over.

Back to the mines. The Vulkan only writes itself when prompted with well-conditioned problem statements.

thebeardredis2mo ago

Hopefully soon. My new unwords are f.e. "agentic".

Havoc2mo ago

Gov bailout seems like the only way out.

m12k2mo ago

Remember, having the dot com bubble burst did not prevent the internet from being integrated more and more in society over the next couple decades. What it did was stop the headless investment where money was thrown at anything that tangentially could be called "online". We went from "nobody knows what this is, but everyone wants a piece of it" to "we know what it is, and we sure did pursue a lot of bad ideas when we didn't". Expect something similar to happen with AI - having the bubble burst will not stop it in its tracks, but it will change what gets invested in.

nexos2mo ago

I think ultimately the AI bubble is bound to burst solely based on the fact that no AI company has turned a profit. A business model consisting of pure speculation on profitability when profit has not come in for 4 years now indicates that the tech industry is over-betting on AI. That, plus consumer backlash at the way AI is jacking up consumer prices on RAM etc., means that the bubble is bound to burst. To paraphrase Linus Torvalds, AI is a helpful tool but I look forward to the day it’s a regular part of life and the hype cycle ends.

relation_al2mo ago

RAM's dropping? Woohoo!

LarsDu882mo ago

The world has seen this play out before. Launch a service, sell it at a loss to achieve hypergrowth, then raise prices, add ads, and enshittify.

What is different is the scale and the hardware. When Britain underwent its rail building boom in the 1850s, the bubble bursting left the kingdom with 150 years worth of infrastructure. Unless we invest in energy buildouts, we will be left with billions in rapidly depreciating GPUs.

nickcageinacage2mo ago

AI is shit. I just want this to be over. Can we move on

post-it2mo ago

Cheaper hardware, discounts on stocks, and we keep AI itself? My flavour of hopium, sign me up.

jarek832mo ago

I wonder if AI labs could be bailed out - like banks.

See, they kind of became a national asset and letting it go down, will leave USA watching China taking the lead for a very long time ahead. It just can't happen - right? So we'll just all fund it in taxes.

joshstrange2mo ago· 42 in thread

> RAM prices are crashing because new models won’t need as much

[0] https://pcpartpicker.com/trends/price/memory/

[1] https://tech.sportskeeda.com/gaming-news/how-google-s-new-tu...

amelius2mo ago

> Now if that means RAM prices come down (as speculated, not reported on, in the link) or the AI companies just do more things with their extra ram is yet to be determined.

I think it is determined:

https://en.wikipedia.org/wiki/Jevons_paradox

woadwarrior012mo ago

Yeah, even if one efficiency trick lands, people will end up spending the saved budget right back on bigger models, and/or more "thinking" tokens.

1 more reply

pydry2mo ago

Jevons paradox only applies if demand hasnt already been saturated.

8 more replies

fotcorn2mo ago

Also, there is zero reason to think that the big labs did not have anything similar to TurboQuant for a long time already.

The recent blog post from Google announcing TurboQuant does not change anything regarding RAM planning for the big labs.

TurboQuant itself is already a year old! So even smaller labs have probably seen and implemented it.

scw2mo ago

1 more reply

schmidtleonard2mo ago

The open source tooling got quantization support 3 years ago! It was a lesser type of quantization, but more than enough to prove that the savings just go to bigger models.

adjejmxbdjdn2mo ago

drakythe2mo ago

2 more replies

ToucanLoucan2mo ago

If they see them. Plenty of businesses are still charging pandemic prices for all kinds of goods and simply pocketing the difference.

Cars come to mind instantly. Prices exploded in 2020/1, due to legitimate shortages, most of which have been plus or minus resolved, but the prices for new (and used!) cars never came back down.

2 more replies

eru2mo ago

Why would they be lagging?

slfnflctd2mo ago

> almost as bad as when LLMs link things to prove their point, you visit the link, and find it says nothing of the sort or even the opposite

To be fair, they got it from us. This happened to me plenty of times long before modern LLMs.

throwup2382mo ago

It learned by reading HackerNews, after all.

ajross2mo ago

> Reality begs to differ

T-A2mo ago

> RAM prices spiked speculatively, and they're going down for the same reason.

https://pcpartpicker.com/trends/price/memory/

Note how flat the black lines are.

Then note how wide the gray bands are. That makes it very easy to cherry-pick a few examples to present as "supporting evidence" that prices are doing whatever you want to believe they are doing.

1 more reply

Forgeties792mo ago

cma2mo ago

> RAM prices spiked speculatively

Didn't OpenAI buy up 40% of the capacity all at once?

1 more reply

h14h2mo ago

In any case, I think it's pretty fair to speculate we may be seeing RAM prices start falling sooner rather than later.

joshstrange2mo ago

> In any case, I think it's pretty fair to speculate we may be seeing RAM prices start falling sooner rather than later.

I sure hope so. RAM, HDDs, and SSDs are all crazy-high right now and I was in the market for literally all 3 but have paused all my buying because I can't justify the costs as they stand today.

1 more reply

layer82mo ago

martinvolOP2mo ago

RAM prices haven't crashed yet and it'll take time because it has to propagate within the supply chain. Micron is -20% from the top already https://www.investing.com/equities/micron-tech

Stock price is the best forward indicator I can think of

cwillu2mo ago

That might be true, but it's still straightforwardly wrong to say that RAM prices have crashed, and it calls into question everything else they write.

1 more reply

albinn2mo ago

I would think that we are going to see RAM prices increase even more, given, among other things, pure helium disruptions and increased electricity prices.

I haven't looked closely into TurboQuant, but perhaps it will revolutionize just as much as the 1-bit llm did...

aurareturn2mo ago

Even if TurboQuant, which was released a year ago, drastically lowers RAM requirements, AI labs will just release bigger models.

Jevons Paradox. When are we going to learn that efficiency gains in AI do not decrease hardware usage?

functional_dev2mo ago

valid point, it reminds me of video games. GPUs got faster, devs pushed higher resolutions, more complex lighting instead of saving power :)

gmerc2mo ago

Consumer RAM is starved by production capacity shifting to HBM. HBM dropping in price would not affect consumer RAM on any immediate timeline. Also, as pointed out by many, Jevons Paradox.

am17an2mo ago

Thank you, there are two things I would like to point out:

2) But prices for memory companies are crashing! look around, the whole market is crashing.

sandworm1012mo ago

veunes2mo ago

maeln2mo ago

> > RAM prices are crashing because new models won’t need as much

JCTheDenthog2mo ago

incognition2mo ago

You nailed it. It's algos and noise trading

faangguyindia2mo ago

If the gains are real, why are the limits so bad? Google can barely serve Antigravity.

BoredPositron2mo ago

You get more Claude tokens from a Google subscription via Antigravity than from Anthropic. Especially if you use the 5 other "family" accounts you can share the subscription with...

owlmirror2mo ago

Isn't that at the moment still a free product? Of course they will not prioritize serving those requests. That tells you nothing.

tracker12mo ago

Even worse, 3 memory companies control well over 90% of the international market, with a history of cartel collaboration that's going to be ever harder to prove with fewer companies.

hintymad2mo ago

Some also argue that the RAM price keeps rising because of the bullwhip effect. I was wondering if there's any way for us to differentiate sustained demand from the bullwhip effect.

Valakas_2mo ago

This article and the title are total clickbait filled with emotional hooks. And it worked. You totally debunked it but look how it still became so popular.

hirako20002mo ago

Not crashing yet. The article is looking 1 to 5 years ahead.

Given Nvidia's CEO's agitation I would give credit to the prediction, and if it's correct the price will go back to what it was, or even lower if investments in capacity are made today.

michaelcampbell2mo ago

My take is new capabilities will consume any price reductions, making them moot. At least in the medium term.

A RAM price drop due to some magic efficiencies assumes everything else doesn't change, which I doubt anyone honestly thinks will be the case.

sigmoid102mo ago

Forgeties792mo ago

Sometimes it's real easy to see who has risky short positions right?

mNovak2mo ago

This feels similar to when Deepseek first debuted with claims of ultra-low cost training, and all the pundits exclaimed that Nvidia was finished, the bubble had burst, etc.

infecto2mo ago· 39 in thread

Aperocky2mo ago

Demand of tokens is absolutely skyrocketing.

This feels like a boom before bust scenario, and I'm not even sure if it will bust.

skeeter20202mo ago

WarmWash2mo ago

>potentially fail given the current quality of the output.

The question is how big the fail is if you measure it in 3 month increments going back to late 2022.

hirako20002mo ago

Tulip sales also skyrocketed.

Seriously, what value are tokens providing other than justifying layoffs? Concretely. Today. Not in the speculative scenario where cardiologists could be replaced with models.

gdilla2mo ago

the busting will come from the token consumers. so many disasters waiting to happen.

boriskourt2mo ago

> The cost to serve tokens is absolutely profitable today and that’s been true for at least a year.

> For the data center build outs, demand for tokens is still exceeding supply.

Can you provide any numbers for this?

wongarsu2mo ago

ACCount372mo ago

Check the token prices for open weight LLMs at various independent inference providers.

That gives you a very good estimate of "how much can you serve the tokens of a model of the size N for while making a profit".

Now, keep in mind: Kimi K2.5 is 1T MoE. Today's frontier LLMs are in the 1T to 5T range, also MoE. Make an estimate. Compare that estimate with the actual frontier lab prices.

bob10292mo ago

https://www.cerebras.ai/blog/cerebras-cs-3-vs-nvidia-dgx-b20...

paulddraper2mo ago

Anthropic has said inference is profitable. That’s a biased source, but the math pencils.

This is why switching to local open weight models saves a lot of money. (Even though it’s not apples to apples.)

infecto2mo ago

For supply look at outages and growth rates at companies like openrouter. The demand is growing every week.

iterateoften2mo ago

According to OpenRouter, token demand is growing at something like 10% a week.

It’s insane
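For scale, here is a minimal sketch of what 10% a week would compound to if it held for a full year. The 52-week horizon and the function names are assumptions for illustration; the thread only gives the rough weekly rate.

```python
import math


def growth_multiple(weekly_rate: float, weeks: int) -> float:
    """Total growth multiple after compounding weekly_rate for `weeks` weeks."""
    return (1.0 + weekly_rate) ** weeks


def doubling_time_weeks(weekly_rate: float) -> float:
    """Weeks needed to double at a constant weekly growth rate."""
    return math.log(2.0) / math.log(1.0 + weekly_rate)


# Assumption: the ~10%/week figure quoted above stays constant for a year.
print(f"multiple after 52 weeks: {growth_multiple(0.10, 52):.0f}x")
print(f"doubling time: {doubling_time_weeks(0.10):.1f} weeks")
```

At a steady 10% a week this works out to roughly a 140x increase per year, with demand doubling about every seven weeks, which is why a rate like that is unlikely to persist for long.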

infecto2mo ago

chrisweekly2mo ago

> "decades of overinflated engineering salaries"

'Overinflated' relative to what? You make some good points but I don't accept this as a premise.

schmidtleonard2mo ago

Overinflated relative to the wet dreams of the ownership class.

fcarraldo2mo ago

Well, not GP, but I do. Let’s look at the numbers:

Median senior SWE salaries in SF: https://www.levels.fyi/t/software-engineer/levels/senior/loc...

Median income in metro areas: https://www.cnbc.com/2024/07/11/the-median-salary-for-the-25...

Outside of the US this may be less true, but I took GP’s “most of us on HN” to mean people who work in US tech companies, which are primarily concentrated in high-cost-of-living areas.

malfist2mo ago

> The cost to serve tokens is absolutely profitable today

How can you possibly say that? Everyone knows that's not the case, these companies are losing money every day selling tokens. Revenue is not the same thing as profit.

dist-epoch2mo ago

There are private companies which rent/buy GPUs, run open-weight LLMs on them and sell the tokens. They absolutely make profit, and their clients think they get a good deal and are buying the tokens.

infecto2mo ago

naravara2mo ago

I think they’re losing money because they have to amortize the costs of training the models in the first place, which is where most of the resource sink is.

This is why they were freaking out about DeepSeek just taking the trained model weights and slapping an interface on it.

jeromegv2mo ago

Yep, especially if we look at what happened just last week, both Google and Anthropic have dropped how much you get out of your existing plans.

Tade02mo ago

Currently on a given day I'm chewing through approximately the equivalent of my lunch money, but where there's opportunity to extract wealth, someone will find a way to do it.

h14h2mo ago

With frontier models seemingly hitting diminishing returns in quality, I struggle to see a world in which gigantic, expensive, general-purpose models don't become increasingly niche.

bluedays2mo ago

It's already a job requirement for a bunch of places, they're just not listing it. I lost out on a job recently because I haven't used cursor ai.

dist-epoch2mo ago

Jensen is already talking about $1000/mil tokens soon.

thereitgoes4562mo ago

> The cost to serve tokens is absolutely profitable

Can you explain why you know better than the analyst at Cursor cited in this article?

iterateoften2mo ago

infecto2mo ago

Can you cite your source for the analyst at Cursor? I read the article and looked through the boatload of links but struggled to find what you are referring to. Ty

noelsusman2mo ago

That analyst was talking about subsidizing tokens through the subscription plans, which is a different claim.

SirensOfTitan2mo ago

This is a classic HN mistaking the map for the territory. R&D and capex absolutely figure into de-facto profitability and sustainability for AI labs, despite their separate treatment in accounting.

> well most of us here on HN have benefited from decades of overinflated engineering salaries being paid by often companies that were not profitable and not only unprofitable

ForHackernews2mo ago

> any company ingesting that much cash needs to justify its capacity to survive.

What, why? There are tons of low-margin capex-intensive businesses out there.

I think AI will end up being like hosting. All the models will converge to being pretty decent and the companies will have to compete on efficiency since they are selling a generic commodity.

You can already see Anthropic fears this scenario since they try so hard to make people use their first-party tools rather than plugging Claude in as a generic part of a third-party stack.

LLM hosting is the next VPS.

guzfip2mo ago

> Software is or was one of the few remaining arenas wherein a person can find a consistently.

I want to add something to this: it is one of the few fields that can afford a middle or upper-middle-class lifestyle and is accessible.

But at this stage of life? I don’t have the time or money to spend a decade+ paying some institution tens of thousands of dollars to hopefully maybe have a real career.

Once software as a career dies, I suspect many will find themselves locked out of the middle class for generations.

fcarraldo2mo ago

> Software is or was one of the few remaining arenas wherein a person can find a consistently

keybored2mo ago

> This is a really concerning perspective: people were paid what they were worth.

The outlet for capturing more of the value they create is entrepreneurship (Hello HN). Never any collective organizing. And entrepreneurship is easily bought out via acquisition.

Collective bargaining would have been relevant in case they ever get automated... by the very software they co-created.

[1] All from recollection since this is just news from the Frontier to me

[2] Of course the pay might be much higher now; this might have been a while ago

[3] when it isn’t simply exploited by corporations just using OSS without giving any back; a logical turn of events when no license or law forces them to contribute back

9rx2mo ago

> This is a really concerning perspective: people were paid what they were worth.

The parent comment doesn't discount that, only pointing out that "what they were worth" was inflated due to a speculative environment. Wherein lies your concern?

infecto2mo ago

The salary jab was probably a little harsh.

Your ending is a bit of a fizzle too. There are many capex intense businesses that do just fine.

nickphx2mo ago

step change? how? profitable? where did you read that? people want tokens? really? who are these people?

elzbardico2mo ago

In accounting, almost anything you want can be true, at least for some time.

Eridrus2mo ago

The article is just helpfully illustrating how artisanal you can make your slop if you really try!

Aurornis2mo ago· 18 in thread

This article tries to build upon a lot of half-truths or incorrect facts, like this:

> OpenAI is struggling to monetize. They turned to showing ads in ChatGPT,

The ads aren’t going into your paid plans (except maybe a highly discounted tier, depending on the market). The ads are a play to offer a free version. Having an ad-supported free tier isn’t new.

Izkata2mo ago

Sounds like it is new for ChatGPT though. That's also how it started with TV and Youtube, first on the free tier then expanding to the paid ones.

smt882mo ago

YouTube, Spotify, and most video streamers have zero ads on paid tiers. I never see video ads.

carlosjobim2mo ago

YouTube doesn't show ads on the paid plan. If you're talking about sponsored segments those would be impossible to moderate, and YouTube does offer easy skipping of those.

krferriter2mo ago

I've never had ads in my Youtube Premium

butlike2mo ago

This statement doesn't discount the original statement: that ads are going into GPT, which Sam called a last resort.

throwaway274482mo ago

> We have very strong indicators that inference is not a money loser for these companies and is likely very profitable.

Why is OpenAI specifically losing money hand over fist then?

aurareturn2mo ago

Training. But training costs are a smaller and smaller percentage of revenue as inference revenue grows faster than training costs.

gruez2mo ago

To be fair people aren't exactly bullish on the prospects of deepseek or z.ai either, it's just they're below radar so they don't get mentioned.

Kye2mo ago

Z.ai is at least owned by a public company, so there might be something in the financials.

https://en.wikipedia.org/wiki/Z.ai

https://www.zhipuai.cn/investor_relations/

But I haven't looked into it.

141132mo ago

> companies are supposed to lose money while they grow

> We have very strong indicators that inference is not a money loser for these companies

butlike2mo ago

They clearly have some vested interest/skin in the game. Not sure it's worth retorting that one.

monegator2mo ago

> Having an ad-supported free tier isn’t new

Having ads shoved in paid tiers isn't new either.

raincole2mo ago

And it usually doesn't result in a market crash.

project2501a2mo ago

> This article tries to build upon a lot of half-truths or incorrect facts, like this:

Yeah, I was wondering why my bullshit detector was going off. This feels as if someone who cooks for Ramsay's kitchen is trying to predict the end of the market hike.

Mentlo2mo ago

We have strong indicators that inference is profitable on non-economically-valuable prompts. We don't have strong indicators that inference is profitable on economically valuable prompts.

And I say this as an AI enthusiast with <50% probability of a bubble burst in the short term.

mcv2mo ago

I've heard "They're losing money" since the 1990s. About Amazon and nearly every other tech company.

The strategy is always:

* Build something useful

* Give it away for free to get people excited

* Convince investors that this is going to rule the world

* Grow to dominate the world

* Enshittify

danaris2mo ago

arctic-true2mo ago

nopinsight2mo ago· 16 in thread

> nobody is sure if even their metered pricing is profitable

atwrk2mo ago

But are they actually profitable, or do they employ creative accounting where only parts of overhead expenses are counted against all of inference revenue, similar to what Uber did?

therealdrag02mo ago

Does it matter if it’s creative accounting? Uber is a great example of a company that everyone was certain would fail because it was unprofitable and now it succeeded and is profitable.

baq2mo ago

If they shut down all training today they’d be absolutely printing money for the next couple quarters and then die with a bang once the other lab releases the next frontier to the public.

martinald2mo ago

Yes I wrote a detailed article about this Forbes claim. https://martinalderson.com/posts/no-it-doesnt-cost-anthropic...

Key points - if you compare it to openrouter costs for ~similar sized models it is ~90% gross margin.

And this claim came from Cursor - not Anthropic!
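As a rough illustration of the comparison above, gross margin on inference is (price - cost) / price. The sketch below uses made-up per-million-token figures, not real Anthropic or OpenRouter prices, and treats a third-party price for a similar-sized open-weight model as an upper bound on serving cost, which is itself an assumption.

```python
def gross_margin(price_per_mtok: float, cost_per_mtok: float) -> float:
    """Gross margin as a fraction of price: (price - cost) / price."""
    if price_per_mtok <= 0:
        raise ValueError("price must be positive")
    return (price_per_mtok - cost_per_mtok) / price_per_mtok


# Hypothetical figures in USD per million tokens, chosen only to show the arithmetic.
frontier_list_price = 15.0  # assumed list price of a frontier model
open_weight_price = 1.5     # assumed third-party price for a similar-sized open model

margin = gross_margin(frontier_list_price, open_weight_price)
print(f"implied gross margin: {margin:.0%}")
```

This only covers the marginal cost of serving tokens; training, R&D, and capex sit outside the formula, which is the part other commenters dispute.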

shafyy2mo ago

Not counting training models as part of your gross margin is just creative accounting. It's an inherent part of being able to provide the service for OpenAI, Anthropic etc.

Lastly, I would not trust one word that comes out of an executive of an AI company (or any other large company, for that matter).

mrbungie2mo ago

> Lab executives insist that serving tokens is profitable.

I'd love to actually look at both usage + profitability from each user segment to see if their PxQ growth expectations from non-enterprise usage make any sense.

> Many independent providers price tokens of open-weight models at a fraction of Anthropic's prices.

Are those open-weight models as good as Anthropic? Are they the same parameter class?

est312mo ago

It's a loss leader but this is normal. Same has happened with Uber, Airbnb, Amazon, etc. Using VC money to buy marketshare and once you have it, you can milk it.

zozbot2342mo ago

> Are those open-weight models as good as Anthropic? Are they the same parameter class?

techpression2mo ago

I wouldn't trust those claims from any private companies, even public ones play the most insane tricks in earnings calls to inflate numbers or heck, just make up new ones.

I'm not saying they're wrong, but I don't take much stock in their words.

sunaurus2mo ago

dash22mo ago

gedy2mo ago

Buying and driving a new car off the lot costs the manufacturer nothing at that moment, but what happens before that is important to account for.

phantom7842mo ago

Do tokens just cover ongoing operating costs, or are they also able to pay back the cost of training that model originally?

pier252mo ago

So these companies will be profitable if training stops? Is that even a real possibility?

danaris2mo ago

Over the whole industry? No; they can never, ever stop training, or they'll cease to be useful at all very soon.

naravara2mo ago

That’ll probably be a while though, because each successive model tends to be a lot better than the last.

piker2mo ago· 11 in thread

hbn2mo ago

If I saw a helicopter crashed into a tree, I don't have to be a helicopter pilot to know it's not an ideal state of a helicopter and something/some people failed.

s1artibartfast2mo ago

your comment sums up the conflict.

I don't know if you noticed, but there was a shifting of the goalposts from "sub-par" to something wrong/sub-optimal.

The best helicopter you can buy may in fact crash into trees sometimes.

Aperocky2mo ago

Sub par is not the right word, the right word is feature creep.

Markdown has much less of that brilliance, and thankfully I needed none of it.

Last time I authored a word document is probably 2 years ago for a government interaction.

karolist2mo ago

You can have an opinion about a tool as a user without ever having the ability to create such a tool yourself; that's literally what every tech and auto reviewer does.

piker2mo ago

Sure, and the less you understand about the tool’s fundamental capabilities, the less useful your opinion is. The best reviewers have deep knowledge.

curtisblaine2mo ago

piker2mo ago

red_admiral2mo ago

MS Office should last a while if they stop calling it "Copilot 365 Office" or whatever it was.

tapoxi2mo ago

The state of GitHub and Windows 11 certainly qualify as sub-par.

sooperserieous2mo ago

piker2mo ago

schnitzelstoat2mo ago· 8 in thread

It's a winner-takes-all market and everyone wants to be the next Google and not the next Lycos or AskJeeves etc.

joefourier2mo ago

It’s absolutely not winner take all. LLMs have become a commodity and the cost of switching models is essentially nil.

At some point it could even become cheaper to just buy 8x H100s and host Qwen/Deepseek/Kimi/etc yourself if you’re one of those companies paying $3k/mo per engineer in tokens.
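A back-of-the-envelope sketch of that break-even point follows. Only the $3k/mo per engineer figure comes from the comment; the hardware cost, the monthly power and hosting cost, and the team size are hypothetical placeholders. It also assumes the self-hosted model fully replaces the paid tokens, which may not hold in practice.

```python
from typing import Optional


def breakeven_months(
    hardware_cost: float,
    monthly_opex: float,
    engineers: int,
    token_spend_per_engineer: float,
) -> Optional[float]:
    """Months until buying hardware beats paying for tokens, or None if it never does."""
    monthly_savings = engineers * token_spend_per_engineer - monthly_opex
    if monthly_savings <= 0:
        return None  # running costs eat the savings, so there is no payback
    return hardware_cost / monthly_savings


# Hypothetical inputs: a $300k server, $5k/mo to run it, a 20-engineer team.
months = breakeven_months(
    hardware_cost=300_000,
    monthly_opex=5_000,
    engineers=20,
    token_spend_per_engineer=3_000,  # the $3k/mo figure from the comment
)
print(months)
```

With these placeholder numbers the payback period is about five and a half months; for a single engineer the function returns `None`, because the assumed running cost already exceeds the token bill.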

mattmanser2mo ago

I have non-tech friends telling me about preferring other models like Gemini. This feels like the early days of search engines, when people were willing to switch to find better results.

1 more reply

baq2mo ago

> It's a winner-takes-all market and everyone wants to be the next Google

wavemode2mo ago

people used to say this about search engines and web browsers, as well

H8crilA2mo ago

wavemode2mo ago

tbh I don't think this use case is going to be as big as people seem to think

zozbot2342mo ago

delecti2mo ago

logravia2mo ago· 6 in thread

The thing I am struggling with is where is the impact of LLM tools, especially given the massive increase in token consumption from 2025 to now and the saturated presence of LLMs everywhere.

Naively speaking, I have so many expectations for the impact of this tech.

I'd also expect that large corpos like Microsoft and Apple would have more resources to spare on the essential details of their OS like having a functioning taskbar or a predictable, consistent GUI.

I'd expect increased SAT scores or improved PISA results. Maybe even improved mental health, let's go wild.

It strikes me as a reasonably useful tool, personally.

Yet, where are the goods in the aggregate?

atomicnumber32mo ago

d2ssa2mo ago

Going faster only works WHEN you know EXACTLY (or close to it) what you want.

Going faster when experimenting? Nah you actually need a mix of slow and fast, and mostly slow stuff up-front.

atomicnumber32mo ago

I meant 10% faster btw, typo

therealdrag02mo ago

The tech is still young and projects take time. And there are many slow parts of building that have not been accelerated (mythical man month).

tasuki2mo ago

> ⸻

Wow, I'm impressed at your usage of this. Apparently it's 0x2E3B, named "three-em dash".

You must be human!

logravia2mo ago

On Linux you press Ctrl+Shift+U and then type 2E3B, then press enter.
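For anyone who wants to check the code point without the keyboard shortcut, a small Python sketch using the standard `unicodedata` module looks the character up by its hex value:

```python
import unicodedata

# Build the character from the code point mentioned above (U+2E3B).
ch = chr(0x2E3B)

# Print the character, its code point in U+ notation, and its official Unicode name.
print(ch, f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The name reported by `unicodedata.name` is `THREE-EM DASH`, matching what the parent comment found.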

Chance-Device2mo ago· 6 in thread

Aperocky2mo ago

The decision is the right one. Scaling at any cost is the right way to go.

You cannot find the efficiency if you haven't been experimenting at scale, this is true personally as well.

Not everyone scaling to that degree would have the right answer or outcome, many would be wrong and go bust. But everyone who didn't will not have the right answer.

raincole2mo ago

Well said. Quantity itself is a quality.

ap992mo ago

They're not just betting on the current tech, they're building out infra like this because probably any future tech currently being researched will also require massive data centers.

Like how the gpt llms were kind of a side project at openai until someone showed how powerful they could be if you threw a lot more parameters at it.

There could be some other architecture in the works that makes gpts look old - first to build and train that new ai will be the winner.

phito2mo ago

mrob2mo ago

I don't expect hardware prices to go down unless the third option (economic collapse) happens before somebody triggers the dystopia/extinction option.

WarmWash2mo ago

Just to add some slight nuance, which is an important distinction:

They aren't all necessarily racing to be "god", some are racing to make sure someone else is not "god".

aurareturn2mo ago· 6 in thread

This is an awful article. I don't know how it reached #1 on HN.

  OpenAI is struggling to monetize. They turned to showing ads in ChatGPT, something Sam Altman once called a “last resort”, while Anthropic is crushing them with the more profitable corporate customers and software engineers.

  I wouldn’t be surprised at all if in the next couple of quarters we see OpenAI looking for an exit. It will be interesting because the sizes are now so big that we will probably know all the details. The most likely buyer is Microsoft, they already own a lot of it, and because of that, they are the most interested in showing a win.

  Independent reports state that Claude metered models are priced 5x more expensive than their subscribers pay

doom22mo ago

> Bottom line is that H100 prices are near 3 year highs, A100s are still profitable to run, B200 prices are increasing, no one has enough compute.

jsnell2mo ago

A new fab will need to be filled with advanced equipment like lithography machines. They are the most complex thing humanity has ever built.

HackerThemAll2mo ago

> I think OpenAI is going to be bigger than Microsoft in market cap within the next 3 years.

nunez2mo ago

> I think AI agents could completely replace Microsoft Office

How? What do you think lawyers/government will use to write briefs?

the_gipsy2mo ago

> but Anthropic is kicking butt in AI

that's not what the article said:

> They turned to showing ads in ChatGPT, something Sam Altman once called a “last resort”, while Anthropic is crushing them

veunes2mo ago

256BitChris2mo ago· 5 in thread

I could see OpenAI hitting financial issues which triggers some media induced panic and for people to claim the AI bubble has popped.

However, the core utility of the best AI (read: Anthropic's ATM, by miles), will still exist and be leveraged by those who have learned to use it well.

I think of it like the old mainframes in the 70s which would take an entire city block to run, and now we have the equivalent of millions, if not billions of them in our pockets.

baq2mo ago

Anthropic isn’t the best by any reasonable measure. They’re the best in some areas and get pwned in others.

jqpabc1232mo ago

> I think of it like the old mainframes in the 70s

I think this is a good comparison to current AI.

> billions of them in our pockets.

AI in your pocket (but first on the desktop) is a real possibility.

cmrdporcupine2mo ago

The coming months are the reckoning in which the poor quality of the tooling and the safeguards around them become evident and hopefully eventually rectified.

By which I mean the competent organizations are the ones that will come up with cultural and technical solutions to manage the quantity and quality of the code better.

Aside:

On top of that, OpenAI provides far higher token limits. Even their $20 plan goes quite far.

_puk2mo ago

A lot of anthropic's recent improvements are coming from the task focus and improved orchestration around the models, not purely massive changes in the models themselves.

This bodes well for us being at a point that even if the bubble burst, we'd still have usable AI going forward.

eieje2mo ago

It’s pretty much undeniable at this point that the sentiment has changed.

About 2 months ago this place was unbearable - filled with doom and hype AI posts. I welcome the calming and eventual slow release of the bubble.

qoez2mo ago· 3 in thread

hk__22mo ago

Isn’t that covered at the top of the post?

> AI is here to stay. If used right, chances are it will make us all more productive. That, on the other hand, does not mean it will be a good investment.

joefourier2mo ago

The dotcom bubble burst and 26 years later we’re all hopelessly addicted to the internet and the top companies on the stock market are almost all what would have been called “dotcoms” then.

The railroad bubble burst in 1846 not because trains were a dead end - passenger numbers would increase more than 10x in the UK in the following 50 years.

lionkor2mo ago

> History doesn't have to repeat

This is high up there on the list of things people say before, you know, it does

jqpabc1232mo ago· 3 in thread

Another possibility not really addressed here --- local LLMs.

AI on hardware you own and control --- instead of a metered service provider. In other words, a repeat of the "personal computing" revolution but this time focused on AI.

TurboQuant could be a key step in this direction.

schnitzelstoat2mo ago

Yeah, I don't think local LLM's will keep up with what the massive corporations put out. But they might get to a level of performance where it just doesn't matter for most users.

And people would prefer to run a model locally for 'free' (not counting the energy cost) rather than paying for an LLM subscription.

zozbot2342mo ago

netdevphoenix2mo ago

Local LLMs don't sound profitable at all for those building them. If you really wanted a SOTA model, you would be paying eye watering amounts to own it unless you got an open sourced one.

ajay-b2mo ago· 2 in thread

coder682mo ago

The good news is local models have significantly improved. If it all goes down today, you can still run e.g. Qwen 3.5 at home, and it's "good enough" for most workloads.

raincole2mo ago

Don't worry lol. It's not going anywhere. The article is just ragebaiting. Verbatim:

> Anthropic is already in a push to reduce costs and increase revenue

Yeah, it's totally a bad sign when a company tries to... reduce costs and increase revenue.

NickNaraghi2mo ago· 2 in thread

Have you tried Gemini 3.1 lately? It is not even close to Opus 4.6 never mind Claude 5.

This post, like many pessimistic takes, seriously discounts innovation and the exponential takeoff of recursive self-improvement.

endymion-light2mo ago

Exponential take-off is great until it stops- genuinely, what are the signals showing any of the large models are performing exponential takeoff and recursive self-improvement?

Currently a lot of that appears to be marketing hype to drive up usage. Is it exponential, or are the labs spending exponentially more for smaller and smaller gains from LLMs?

bogzz2mo ago

What recursive self-improvement?

franze2mo ago· 2 in thread

hk__22mo ago

Yes, that’s what the author wrote in the second sentence of the post: "AI is here to stay."

irusensei2mo ago

I guess the point is that without the hype subsidizing it, enshittification will ensue.

general_reveal2mo ago· 2 in thread

HN is no longer a reliable place for the truth. Quite frankly, unless you are utterly self educated, you are terribly vulnerable to this place.

At this rate, I’d almost prefer to talk on a private mailing list with vetted resumes.

rvz2mo ago

> HN is no longer a reliable place for the truth.

"No longer?" It never was.

Especially with AI boosters being allowed to degrade the comments section and shilling their paid blogs and violating the HN guidelines.

myspy2mo ago

Why?

hyperpape2mo ago· 1 in thread

This piece seems poorly thought-out, but well designed to get shared.

Promote writers who will actually explain their claims carefully.

martinvolOP2mo ago

They have to fight to stay competitive because the Mag 7 can outspend them, but my hypothesis is that they ultimately won't need to.

ethagnawl2mo ago· 1 in thread

> If investor money dries up, they will be forced to cut their losses and pass the true costs to their users.

I do not see this talked about often enough whilst everyone is in the process of introducing hard dependencies on these services into their workflows.

senordevnyc2mo ago

KaiserPro2mo ago· 1 in thread

The problem with this kind of post is that "How" is almost useless. I can tell you how the bubble pops: the value of these AI companies crashes and takes out a lot of other stuff with it.

The interesting questions are: "What triggers it" and "what also goes tits up"?

The issue with high/international finance is that a good percentage of it (if not more) is fraudulent or semi fraudulent bollocks.

If we look a few years back, NFTs fulfil that niche quite nicely. It was obviously bollocks, but a very convenient way to launder money, or run a series of rugpull operations.

Also, it's no guarantee that AI will trigger the next bubble popping, my money is on Private Equity.

martinvolOP2mo ago

> The problem with this kind of post is that "How" is almost useless. I can tell you how the bubble pops: the value of these AI companies crashes and takes out a lot of other stuff with it.

That's like saying "I know exactly how you're going to die, your heart will stop"

shubhamjain2mo ago· 1 in thread

JohnTHaller2mo ago

> minus the scandals

Given the tech bros involved, we just don't know about them yet. Also was this comment generated using AI? Look at all the em dashes.

hnthrow02873452mo ago· 1 in thread

lstodd2mo ago

DC infra will be scooped up by cloud guys, that's a given. As for GPUs.. well low-precision tflops have other uses besides inference. You can run Doom for example.

HardCodedBias2mo ago· 1 in thread

In general:

Cynicism makes you sound smart. Optimism makes you successful.

People need to take a measured, reality based, view of how the technology is being used today, the adoption curve, and the increase in capabilities over time.

It's clearly being used strongly, and may even be revolutionary.

Bubbles burst when there's no 'there' there. AI has an undeniable 'there'—the only question is the timing of the ROI.

martinvolOP2mo ago

bubbles are created by people investing more than reasonable in something, independent of the actual value it will generate for society.

lnfromx2mo ago· 1 in thread

martinvolOP2mo ago

elorant2mo ago· 1 in thread

I feel that even if the bubble bursts hardware prices will still take years to normalize. So no clear benefit for the average consumer here.

baggachipz2mo ago

monegator2mo ago· 1 in thread

> How this affects you?

> checks list ...

coffeebeqn2mo ago

richard___2mo ago· 1 in thread

Complete bs.

martinvolOP2mo ago

great feedback

tracker12mo ago

beepbooptheory2mo ago

I just don't understand why it justifies so much spending!

agentultra2mo ago

It sounds like most of the data centers promised in 2025 and 2026 are not even built yet and most of the GPUs bought haven't even been installed.

If it does all go down in flames, even floor value is not going to be that valuable.

I can't predict the future but it's smelling a lot like a recession already under way that is bigger than the sub-prime crash.

mvdtnz2mo ago

skeeter20202mo ago

>> Building a datacenter is supposed to be a “safe” investment in normal times, so banks give private credit and mortgages to finance them.

Lerc2mo ago

The tragedy is that when it's all over, one of the surviving passengers will go "See! I knew we were going to crash because of that knitter"

HackerThemAll2mo ago

EternalFury2mo ago

If somehow recovering the capex is not counted, if somehow the cost of developing future models is not counted, then yes, inference costs of current leading models allow a profit.

But those things are tied together.

Even xAI, which now has a reasonably competitive model, is struggling to achieve PMF. Meta is in shambles because their models have underperformed for years now.

titzer2mo ago

I do hope that RAM prices come down, but this was just wishful thinking.

positron262mo ago

When will this concern farm end? Internet is ant-milling harder than a model gone psychotic on synthetic data. Call me when it's over.

Back to the mines. The Vulkan only writes itself when prompted with well-conditioned problem statements.

thebeardredis2mo ago

Hopefully soon. My new unwords include, for example, "agentic".

Havoc2mo ago

Gov bailout seems like the only way out.

m12k2mo ago

nexos2mo ago

relation_al2mo ago

RAM's dropping? Woohoo!

LarsDu882mo ago

The world has seen this play out before. Launch a service, sell it at a loss to achieve hypergrowth, raise prices, add ads, and enshittify.

nickcageinacage2mo ago

AI is shit. I just want this to be over. Can we move on?

post-it2mo ago

Cheaper hardware, discounts on stocks, and we keep AI itself? My flavour of hopium, sign me up.

jarek832mo ago

I wonder if AI labs could be bailed out - like banks.
