Because that wasn't what they claimed to research?
>> for inference it's definitely not worth it.
It's entirely fine if you enjoy local LLMs on your computer, there are people doing horribly inefficient inference on smartphones now. But for pure inference tasks, it's pretty obvious why M5s and Mac Studios aren't replacing TPUs and GPUs.