We said the same thing about Waymo, that it was perpetually in the future. It took them less than a decade. The robots today are functionally capable; they just don’t have the right fuzzy intelligence yet. It’s purely a data problem (a lack of it), and a lot of people are working on it.
It's not just a data problem, it's a hardware problem. Transformer-based robots require even more processing power than plain LLMs, as they also need to process visual and spatial/touch input. We don't have GPUs that fit in a robot-brain form factor and can run fast inference on a SOTA LLM, let alone also handle spatial and visual processing fast enough. And there's currently nothing even approaching a feasible solution for cooling such a device.