- Groq has exactly 0 dollars in revenue
- Groq requires 576 chips to run a single model
- Groq can do low-latency inference, but can't handle large batches and can't run a diverse set of models on a single deployment
- Groq quantizes models to gain speed, significantly degrading quality (and doesn't disclose this to end users, which is deceptive)
- Groq can only run inference; training is not possible on its systems
- SambaNova has real revenue from big customers
- SambaNova can run any model on a single node at speeds Groq needs 576 chips to reach
- SambaNova can do low-latency inference just like Groq, but can also run large batches and host hundreds of models on a single deployment
- SambaNova does not quantize models unless explicitly stated
- SambaNova can run training at performance competitive with Nvidia's, as well as the fastest inference in the world at full precision
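The 576-chip figure follows from Groq's SRAM-only design: each LPU is reported to carry roughly 230 MB of on-chip SRAM and no external HBM, so the entire model has to fit in aggregate SRAM across the deployment. A back-of-envelope sketch, using publicly reported approximate figures (not numbers from this post):

```python
# Back-of-envelope: why an SRAM-only accelerator needs hundreds of chips
# to serve one large model. All figures are approximate public reports.

SRAM_PER_CHIP_GB = 0.230   # ~230 MB of on-chip SRAM per Groq LPU (reported)
PARAMS_B = 70              # e.g. a 70B-parameter model
BYTES_PER_PARAM = 1        # 8-bit weights (i.e. already quantized)

weights_gb = PARAMS_B * BYTES_PER_PARAM             # ~70 GB of weights
chips_for_weights = weights_gb / SRAM_PER_CHIP_GB   # ~304 chips for weights alone

print(f"chips for weights alone: {chips_for_weights:.0f}")
# KV cache, activations, and tensors duplicated for pipelining push the
# real deployment well past this floor -- in line with the 576-chip figure.
```

By contrast, a node built around HBM-backed accelerators holds the whole model in a handful of devices, which is the capacity gap the SambaNova single-node claim is pointing at.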
It really isn't a competition. Groq has done a great job of garnering hype in recent months, but it is a house of cards.
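To make the quantization point concrete, here is a minimal sketch of symmetric 8-bit weight quantization in pure Python. The round-trip error it measures is the precision permanently lost when a model is quantized for speed (this generic scale-and-round scheme is an illustration, not Groq's actual pipeline):

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats onto the integer range [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to floats; the rounding error is unrecoverable."""
    return [v * scale for v in q]

weights = [0.1234, -0.9876, 0.0042, 0.5555, -0.3141]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Round-trip error is bounded by half a quantization step (scale / 2),
# but it is nonzero: full-precision quality cannot be restored afterwards.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(f"max round-trip error: {max_err:.5f}  (quantization step = {scale:.5f})")
```

In a real network this per-weight error accumulates across billions of parameters and many layers, which is why quantization shows up as a measurable quality drop on benchmarks even though each individual weight moves only slightly.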