I was wondering the same thing. Training is much more memory-intensive than inference, so the low memory of consumer GPUs is a big issue. But with 128GB of unified memory, the Digits machine seems promising. I suspect there are other limitations that make training impractical on it.
Training is compute-bound, not memory-bandwidth-bound. That is how Cerebras is able to do training with external DRAM that has only 150GB/sec of memory bandwidth.
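A back-of-envelope way to see this is arithmetic intensity: with a large batch, the weights are streamed from memory once but reused for every token, so FLOPs per byte moved is enormous. The numbers below (7B params, bf16, 1M-token batch, the standard ~6 FLOPs/param/token approximation) are illustrative assumptions, and activation/optimizer traffic is ignored for simplicity:

```python
# Illustrative sketch: why large-batch training tends to be
# compute-bound rather than memory-bandwidth-bound.
# All numbers below are assumptions, not measurements.

params = 7e9            # assumed model size: 7B parameters
bytes_per_param = 2     # bf16 weights
tokens_per_batch = 1e6  # assumed large global batch, in tokens

# Common approximation: ~6 FLOPs per parameter per token
# for the combined forward + backward pass.
flops = 6 * params * tokens_per_batch

# Optimistic traffic model: weights streamed from DRAM once per batch
# (activations and optimizer state ignored).
bytes_moved = params * bytes_per_param

arithmetic_intensity = flops / bytes_moved  # FLOPs per byte
print(f"arithmetic intensity ~ {arithmetic_intensity:.0e} FLOPs/byte")
```

At intensities this high, the chip runs out of FLOP/s long before 150GB/s of DRAM bandwidth becomes the bottleneck; it is small-batch inference, which re-reads all the weights for every generated token, that is bandwidth-bound.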