I was wondering the same thing. Training is much more memory-intensive than inference, so the low memory of consumer GPUs is a big issue. But with 128GB of unified memory, the Digits machine seems promising. I suspect there are other limitations that make training impractical on it.
Training is compute-bound, not memory-bandwidth-bound. That is how Cerebras is able to do training with external DRAM that has only 150GB/sec of memory bandwidth.
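A back-of-envelope way to see this is arithmetic intensity: with a large batch, the weights are streamed from memory once but reused for every token, so FLOPs per byte moved is enormous. The numbers below (7B params, bf16, 1M-token batch, the standard ~6 FLOPs/param/token approximation) are illustrative assumptions, and activation/optimizer traffic is ignored for simplicity:

```python
# Illustrative sketch: why large-batch training tends to be
# compute-bound rather than memory-bandwidth-bound.
# All numbers below are assumptions, not measurements.

params = 7e9            # assumed model size: 7B parameters
bytes_per_param = 2     # bf16 weights
tokens_per_batch = 1e6  # assumed large global batch, in tokens

# Common approximation: ~6 FLOPs per parameter per token
# for the combined forward + backward pass.
flops = 6 * params * tokens_per_batch

# Optimistic traffic model: weights streamed from DRAM once per batch
# (activations and optimizer state ignored).
bytes_moved = params * bytes_per_param

arithmetic_intensity = flops / bytes_moved  # FLOPs per byte
print(f"arithmetic intensity ~ {arithmetic_intensity:.0e} FLOPs/byte")
```

At intensities this high, the chip runs out of FLOP/s long before 150GB/s of DRAM bandwidth becomes the bottleneck; it is small-batch inference, which re-reads all the weights for every generated token, that is bandwidth-bound.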