As far as I understand, all the purpose-built inference silicon out there is being kept in-house rather than sold to competitors: Google's TPU, Amazon's Inferentia (horrible name), Microsoft's Maia, Meta's MTIA. Custom inference silicon seems to be a huge part of the AI game. I doubt GPU-based inference will still be relevant/competitive soon.
Gemini is likely the most widely used generative AI model in the world, considering Search, Android integration, and countless other integrations across the Google ecosystem. Gemini runs on Google's custom TPUs, so I would say a large portion of inference is already running on ASICs. https://cloud.google.com/tpu
"Soon" was wrong; I should have said it is already happening. Google already runs Gemini on its own TPUs, and Nvidia just dropped $20B to buy the IP for Groq's LPU (custom inference silicon). $20B says Nvidia sees the writing on the wall for GPU-based inference. https://www.tomshardware.com/tech-industry/semiconductors/nv...