My M2 Max significantly outperforms my 3090 Ti when training a Mistral-7B LoRA. It's a case-by-case situation, though, since it depends on how well optimized the CUDA kernels are for your particular workload. For inference, for example, there's a big performance delta between standard transformers and ExLlamaV2: Apple silicon may outperform the former, but certainly not the latter.