
0 points · ac29 · 2mo ago · 0 comments

> So say someone built an under $10k system, with perhaps dual RTX 5090. That same system will be able to easily run 20 parallel requests. The only cost is electricity. You can run it 24/7. For 1 year, that's ~$6million

I don't see how you get anywhere close to $6M of tokens out of a pair of 5090s. The class of model they could run is fairly small and extremely cheap to run via API (by my math, running Gemma4-31B for 24 hours costs less than $1 on OpenRouter). Even with 20 concurrent requests, you are orders of magnitude away from $6M/yr.
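A rough sketch of that arithmetic, for anyone who wants to check it. It assumes the "less than $1 per 24 hours" figure is an upper bound per request stream and that cost scales linearly with the number of concurrent streams; both are simplifying assumptions, not measured prices, and the constant and function names are made up for illustration.

```python
# Back-of-envelope: yearly API cost of 20 concurrent request streams.
# Assumption: each stream costs at most $1 per 24 hours (upper bound from the comment).
COST_PER_STREAM_PER_DAY = 1.00  # USD, assumed upper bound
STREAMS = 20                    # concurrent requests from the quoted claim
DAYS_PER_YEAR = 365
CLAIMED_YEARLY_VALUE = 6_000_000  # USD, the ~$6 million figure being disputed


def yearly_api_cost(cost_per_day=COST_PER_STREAM_PER_DAY,
                    streams=STREAMS,
                    days=DAYS_PER_YEAR):
    """Return the yearly cost in USD, assuming linear scaling across streams."""
    return cost_per_day * streams * days


api_cost = yearly_api_cost()
ratio = CLAIMED_YEARLY_VALUE / api_cost

print(f"API cost per year (upper bound): ${api_cost:,.0f}")
print(f"Claimed value is roughly {ratio:,.0f}x larger")
```

Under those assumptions the API bill tops out in the thousands of dollars per year, which is where the "orders of magnitude" gap comes from.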


segmondy · 2mo ago

I never said that. My point is that paying 20 people $35/hour around the clock for a year comes to about $6 million (20 × $35 × 8,760 hours ≈ $6.1M). You can replace that with a $10k system running 20 parallel requests for a year and save a lot of money.
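The labor-cost figure can be reproduced with a few lines. This is only a sketch: it reads the "$35" as an hourly rate (an assumption, though it is the reading that yields about $6 million), counts 24 hours a day for 365 days, and ignores electricity for the hardware. The function and constant names are illustrative.

```python
# Back-of-envelope: cost of 20 people working 24/7 for a year vs. a one-off system.
# Assumption: "$35" is an hourly rate per person.
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, no leap day
SYSTEM_COST = 10_000       # USD, hardware only; electricity is not included


def yearly_labor_cost(workers=20, hourly_rate=35.0, hours=HOURS_PER_YEAR):
    """Return the yearly payroll in USD for round-the-clock coverage."""
    return workers * hourly_rate * hours


labor = yearly_labor_cost()
savings = labor - SYSTEM_COST

print(f"Labor cost per year: ${labor:,.0f}")
print(f"Difference vs. a ${SYSTEM_COST:,} system: ${savings:,.0f}")
```

Note that this compares the machine against wages, not against the price of the same tokens from an API, which is a different comparison from the one in the parent comment.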
