Sorry if you guys are getting overwhelmed with DeepSeek submissions these days. This will be my one and only for a while. It's cool to have a counterweight to all these paid models.
Personally I don't get sick of it. There's a lot of hype around DeepSeek specifically rn, but being able to run SOTA or near-SOTA models locally is a huge deal, even if it's slow.
The issue is that this article conflates (as do many, many articles on the topic) the distilled versions of R1 (basically Llama/Qwen reasoning finetunes) with the real thing. We're not even talking about quantized versions of R1 here, so it's not quite accurate to say you're running R1.