I want to compare this to a car rental business and say it would be like valuing Hertz entirely on the number of cars they own, as opposed to how many they rent out. But cars have a much longer depreciation period, and if there are no customers they’re not costing you more money, unlike the compute you are using for training, which sucks up massive amounts of energy. And those cars maintain decent value even after they’re of little use to the car rental company, unlike the compute here.
That's the default assumption, but in the new GPU+memory-constrained age it isn't true.
Time on 4 year old H100 servers costs more now than when they were new (!!)
Is it an age or a temporary situation?
The GPU shortage looks to be even longer lived.
The key question is the direction of LLMs. Right now, LLMs are taking over human jobs. If the cost of silicon+power < cost of a human being doing the same work, what rational reason is there to employ a human being?
If this applies to SWEs, lawyers, business analysts, many research scientists, .... this situation could persist for a long, long time. While capital costs less than the inputs of labor (nominal food, housing, etc.), there is no need for labor.
The key question is about continued progress in models, and of the tooling around them:
- Plateau: Old silicon obsoletes in due course
- Rise quickly: Old silicon maintains value for a long time
I assumed the latter, and therefore that the memory is depreciating along with the GPU cores it's soldered onto the same PCBs with.
... or is it a different argument being made, perhaps that depreciation for GPUs has slowed because rising demand will keep them in service longer?
There are several confounding factors.
We’ve seen massive inflation since then. So some growth in cost was expected.
More importantly, the current tech industry almost always starts by selling things at a loss. The increased cost could simply be the industry choosing not to subsidize that particular service anymore.
But also, I don’t think that’s a realistic comparison. Rented-out GPUs likely don't have a similar use profile to compute used for training LLMs. The latter is likely closer to the cryptocurrency GPUs that are running at full tilt 24/7.
And those things physically burn out.
This is untrue.
H100s are used for training (well, they were, but they are now outdated because B100/B200s are much faster).
Most of the reason people rent H100s is for smaller training runs.
If you are doing inference you usually buy managed capacity at Baseten or something, and that is often priced differently (although it comes down to an extra margin on longer term H100 prices basically).
Inference utilization is often actually higher than training now because so much effort has been spent on optimizing that stack.
What I am wondering, though, is how long you can run such a system at basically full load without interruption before it starts to physically degrade.
If I have an H100 and I let it run for 4 years at full throttle, does it still have the same theoretical value as it had at the start, or are the chips just burning out?
I think I remember that back when the cards used for crypto mining were sold en masse on eBay, the advice was to stay away from them because they are more likely to fail?
Temperature is a big factor, as well as current density.
But there's also the number and magnitude of thermal cycles (which translate into mechanical stress, leading to metal-fatigue-like effects on contact points etc.), attack from chemicals in the air, cosmic radiation, ESD damage & more. Some may matter, some not.
That's why "new" > "used" in case of electronics. Especially since you don't know the (ab)use history of used parts.
That's because the rate of improvement in silicon manufacturing has been continually declining for a few decades, which has a compounding effect. Just compare the technological improvements in successive decades. 1976->1986->1996->2006->2016->2026.
That's why "in real terms" performance has only been improving very slowly if you compare apples to apples (and not e.g. apples to oranges by reducing precision, like NVIDIA tends to do, or by comparing chips with x W to an MCM with x*2 W and saying the latter is much faster). The "just halve the number of bits in each generation" strategy has also run out now; there are no more bits to halve.
Let's not mix up depreciation of real value vs USD price (which is arbitrary, plus government controlled)
There's a reason old 3090s went from $600 in 2022 to over $1K in 2026.
How someone can look at an asset class that's appreciated an order of magnitude in the last two years and say it will depreciate in value when the tailwinds are even stronger now is beyond me.
At some point the market will be saturated with supply and prices will come down for older-gen hardware. It can take years, but it happened to fiber cable, and fiber doesn't even depreciate like chips.
The same argument you’ve made would work for tulip bulbs, dotcom prices, or whatever. Prices go up until they don’t. Exponentials don’t last forever and the intrinsics of technology assets depreciate: things wear out and are also replaced with better things.
* except RAM
The frontier labs are shifting from pricing grounded in the price of compute, to pricing grounded in the intelligence provided, or more specifically the economic value of that intelligence downstream.
The margins on that allow them to pay a hefty premium on compute and still come out ahead.
As they buy more compute at high prices, they're also pricing out competition from cheaper models. It's already become materially more difficult to get compute to run open weight models at competitive prices as a result of frontier labs in the last year.
Opus 4.7 has all the signs of a smaller model distilled from a newer pretraining run... except a smaller price.
Flash 3.5 rose in price pretty meaningfully over Flash 3
GPT 5.4 got a small price bump over gpt-5.3-Codex/gpt-5.2, then gpt-5.5 doubled pricing over gpt-5.4
Even open weights aren't immune: Kimi K2.6 was originally priced higher despite openly being 2.5 + more post-training, same with GLM 5.1 vs 5
-
All while rental prices are spiking month over month, and NVIDIA Inception discounted prices for buying are higher than undiscounted prices for buying 6 months ago...
In the medium term, everyone ramps up production. Huawei and other Chinese companies work really hard to develop in-house alternatives. At some point, the hype cycle will peak and less money will flow into datacentres (yes, this will happen. It always does. Even for technologies that change society. The bubble always bursts).
The question is not if this will happen. It will happen. It's just a question of when it happens and how big the magnitude of the cycle is.
Same with GPUs. There is also a huge market for used GPUs from 1-2 generations ago. The A100 is a six-year-old chip at this point and is still running strong, especially for inference. Like cars, chips can be refurbished and repaired. A hyperscaler or even a mid-level player here isn't going to hold onto chips for their entire usable lifespan.
So are you using the computers or not? I'd argue that if you're using them for training, then it's not wasted capacity. And if you're not using them, then you can turn them off, so you're not sucking up energy.
I don’t know, but this dude at my son’s school has a 32GB RTX 5090 and it’s worth more than what he paid for it; and he did the same trick with the RTX 4090 before that.
As long as shortages are the rule, these assets are appreciating.
There is depreciation, which is taking the purchase price and dividing it across N years (typically 5). That's the D in EBITDA and is mostly used as a profitability calculation.
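To make the accounting version concrete, here is a minimal Python sketch of that straight-line calculation. The function name, the $30,000 purchase price, and the zero salvage value are illustrative assumptions, not figures from anyone in this thread.

```python
def straight_line_schedule(purchase_price, years=5, salvage_value=0.0):
    """Straight-line depreciation: spread (price - salvage) evenly over `years`.

    Returns the annual expense and the book value at the end of each year.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    annual_expense = (purchase_price - salvage_value) / years
    book_values = [purchase_price - annual_expense * y for y in range(1, years + 1)]
    return annual_expense, book_values


# Hypothetical GPU bought for $30,000 and written off over 5 years.
annual, book = straight_line_schedule(30_000, years=5)
print(annual)  # 6000.0
print(book)    # [24000.0, 18000.0, 12000.0, 6000.0, 0.0]
```

Note that the schedule is fixed at purchase time: the book value hits zero in year 5 whether the card's resale or rental price has fallen, held, or gone up.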
The depreciation of a GPU also gets mucked up in the current debt-financed GPU market: DDTLs (delayed-draw term loans). The people running the GPUs often don't even own the GPU; they lease it, so there is nothing for them to depreciate (D).
The analogy that a GPU is like a used car makes zero sense. There is no oil or tires to change on a GPU. They don't wear out in the same way that a rental car would. They are housed in climate controlled locations with clean power. They just don't fail the way that is portrayed in the press.
Useful life of a GPU is based on profitability. When does opex cost more than profitability?
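A rough way to put numbers on that question, as a sketch only: assume a rental rate that changes by a fixed fraction each year, a flat per-hour operating cost, and a fixed utilization. Every figure and name below is a made-up assumption for illustration, not data from my fleet or anyone else's.

```python
def economic_life_years(rate_per_hour, annual_rate_decline, opex_per_hour,
                        utilization=0.7, max_years=20):
    """Return the first year (1-indexed) in which opex exceeds rental revenue.

    Returns None if that never happens within `max_years`.
    Assumes the machine is powered on all year but only billed for
    `utilization` of those hours, and that the rental rate falls by
    `annual_rate_decline` (a fraction) after each year.
    """
    hours = 24 * 365
    for year in range(1, max_years + 1):
        revenue = rate_per_hour * utilization * hours
        opex = opex_per_hour * hours
        if opex > revenue:
            return year
        rate_per_hour *= (1 - annual_rate_decline)
    return None


# Hypothetical: $2.00/hr today, rate falls 25% per year, $0.40/hr to run.
print(economic_life_years(2.0, 0.25, 0.40))  # 6
# If rental rates hold steady (or rise), the card never becomes unprofitable.
print(economic_life_years(2.0, 0.0, 0.40))   # None
```

The point of the sketch is that the answer is driven almost entirely by the rental-rate trend, which is exactly the thing people in this thread disagree about.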
Some companies, like mine, also have support contracts. Anything goes wrong with the GPU (or any part of the system), Dell comes and fixes it at no extra charge. We just migrate customers and workloads to hot spares while the parts are replaced.
As for compute going down in value... the 122TB of enterprise NVMe and 2GB of RAM in each server that I bought 2 years ago is now worth vastly more than I paid for it. I'm also renting my GPUs out for more money now due to supply being so tight and demand being so high.
The comment you replied to is word for word what people hyping Canadian telecoms were saying before the dotcom crash!