But you need CPUs with the highest number of chiplets, because the interconnect between the memory controller and each chiplet is what limits memory bandwidth there. And those are of course the most expensive ones. Even then it's still much slower than GPUs for LLM inference, but at least you have enough memory.
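
A rough back-of-envelope sketch of why the chiplet count matters. It assumes token generation is purely memory-bandwidth-bound (every generated token streams the full set of weights from RAM once) and that each chiplet's link to the memory controller has a fixed bandwidth cap. The function names and all numbers are made-up placeholders for illustration, not specs of any real CPU.

```python
def effective_bandwidth_gbs(num_chiplets, link_gbs_per_chiplet, dram_gbs):
    """Usable memory bandwidth in GB/s under a simple two-limit model.

    Assumption: the cores can pull at most num_chiplets * link_gbs_per_chiplet
    through the chiplet links, and never more than the DRAM itself delivers.
    """
    return min(num_chiplets * link_gbs_per_chiplet, dram_gbs)


def decode_tokens_per_second(bandwidth_gbs, model_size_gb):
    """Upper bound on tokens/s if each token reads all weights once."""
    return bandwidth_gbs / model_size_gb


if __name__ == "__main__":
    # Placeholder numbers, NOT real hardware specs.
    LINK_GBS = 50.0   # hypothetical per-chiplet link bandwidth
    DRAM_GBS = 400.0  # hypothetical total DRAM bandwidth
    MODEL_GB = 40.0   # hypothetical size of the model weights

    for chiplets in (4, 8, 12):
        bw = effective_bandwidth_gbs(chiplets, LINK_GBS, DRAM_GBS)
        tps = decode_tokens_per_second(bw, MODEL_GB)
        print(f"{chiplets} chiplets: {bw:.0f} GB/s usable, ~{tps:.1f} tokens/s at best")
```

With these placeholder numbers the 4-chiplet part is capped by its links at half the DRAM bandwidth, while the 8- and 12-chiplet parts can use all of it. Real systems have more going on (NUMA layout, compute limits, caches), so treat this only as a way to see where the bottleneck sits.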