Thanks for clarifying this. Could you clarify whether your chip supports the transformer architecture in general, or only specific models, e.g. Llama 70B? In the latter case, would your ASIC have to be reprogrammed for each model?
This seems to be a novel definition of "smarter" - one could also argue that the printed answer key for a standardized test is smarter than most humans.
I made a request to access their developer cloud. Anyone have any idea when they'll start processing those requests and how many slots they might have?
The more the merrier. If we do need power plants' worth of juice, then it may as well be dedicated custom hardware to optimise this to the max.
I mean, are they? Seems like the industry would prefer these things to become commodities, especially if it helps with portability and reproducibility.
Of course GPUs (and CPUs, and pre-transformer/LLM-era NPUs/TPUs) are also going to evolve to better suit LLM workloads. So we may see a convergence in capabilities.
What is for sure is that LLM chip architecture is not yet settled. Currently available chips were mostly designed before the LLM craze, with only slight adaptations, and they leave large room for improvement: at least 10-100x (maybe 1000x) in power efficiency. And power efficiency is a key factor (in addition to upfront hardware investment), which is the opportunity this startup (and others) have identified.