Fast, but stupid.
Me: "How many r's in strawberry?"
Jimmy: There are 2 r's in "strawberry".
Generated in 0.001s • 17,825 tok/s
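For reference, the model's answer above is wrong; a one-line check shows "strawberry" actually contains three r's:

```python
# Count occurrences of "r" in "strawberry" (s-t-r-a-w-b-e-r-r-y).
word = "strawberry"
print(word.count("r"))  # → 3, not the 2 the model claimed
```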
The question isn't how fast it is. The real questions are:
1. How is this worth it over diffusion LLMs? (Diffusion LLMs aren't mentioned anywhere in this thread.)
(This comparison also assumes that diffusion LLMs will keep getting faster.)
2. Will Talaas also work for reasoning models, especially those beyond 100B parameters, and still produce correct output?
3. How long will it take to turn newer models into silicon? (The model-development industry moves faster than Talaas can.)
4. How does this work when one needs to fine-tune the model while still keeping the speed advantage?