
0 points · ein0p · 2y ago · 0 comments

Well, I should have said “it wouldn’t be too difficult for me” then. I keep forgetting why I get paid so much.


2 comments · 1 top-level

VirusNewbie · 2y ago · 1 in thread

I would love for you to expound. I found it interesting that you qualified your "should you bother, no" with "unless you are doing inference at scale", but in the previous paragraph you explained why you can get better performance with GPUs.

So is there some advantage to TPUs, assuming there were SWE/API parity with GPUs?

ein0p (OP) · 2y ago

Could be cheaper, depending on workload, and if you’re large, that could justify the cost of the additional SWE time required to port and support. Triton/CUDA requires people who know both DL and low-level programming. Whether you get better performance _per dollar_ really depends on the workload and also on its size. Here I don’t just mean the cost of buying compute in the cloud; I mean the broader definition: total cost of doing business, all in, including SWE cost. If you’re huge (e.g. Anthropic), SWE cost is a lot easier to justify at scale. If you’re on the smaller side, SWE cost matters a lot more. It’s way easier to hire PyTorch people (market share 60%) than e.g. JAX people (market share 3%). And yeah, I know there’s Torch XLA, but it’s basically the same thing with a different frontend.
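To make the "all in" arithmetic concrete, here is a rough sketch of the break-even reasoning. Every number and name in it is a made-up assumption for illustration (the porting cost, the savings fraction, the spend levels); none of them come from real pricing or from this thread.

```python
def break_even_compute_spend(porting_cost: float, savings_fraction: float) -> float:
    """Yearly compute spend above which porting pays for itself within a year.

    porting_cost: hypothetical yearly SWE cost to port and support the new stack.
    savings_fraction: hypothetical fraction of the compute bill saved after porting.
    """
    if not 0 < savings_fraction < 1:
        raise ValueError("savings_fraction must be strictly between 0 and 1")
    return porting_cost / savings_fraction


def total_cost(compute_spend: float, savings_fraction: float, porting_cost: float) -> float:
    """All-in yearly cost after porting: the cheaper compute bill plus the added SWE cost."""
    return compute_spend * (1 - savings_fraction) + porting_cost


# Hypothetical inputs: $2M/year of extra SWE cost, 25% cheaper compute.
PORTING_COST = 2_000_000
SAVINGS = 0.25

threshold = break_even_compute_spend(PORTING_COST, SAVINGS)

# A small shop spending $500k/year ends up paying more after porting;
# a large one spending $100M/year comes out well ahead.
small_after = total_cost(500_000, SAVINGS, PORTING_COST)
large_after = total_cost(100_000_000, SAVINGS, PORTING_COST)
```

With these assumed inputs the break-even spend is $8M/year: below it the SWE cost dominates, above it the compute savings do. The hiring difficulty mentioned above would show up as a larger porting cost, which pushes the threshold higher.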
