Xeons have a much longer shelf life and can handle diverse workloads. If you order hardware specifically for LLM inference and some new hardware/model combination then turns out to be much better at it (which will happen, because a lot of people are working on exactly that), you might be in trouble.
It's like setting up a warehouse of GPUs to mine Bitcoin while everyone else is switching to ASICs.