
daemonologist · 1y ago · 0 points

The quantized model fits in about 20 GB, so 32 GB would probably be sufficient unless you want to use the full context length (long inputs and/or lots of reasoning). 48 GB should be plenty.
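To make the "unless you want the full context length" caveat concrete, here is a rough back-of-the-envelope sketch. It assumes a standard transformer decoder whose KV cache stores one key and one value vector per layer, per KV head, per token. The layer count, head count, head dimension, and the `reserved_gb` allowance for the OS and other apps are all illustrative assumptions, not the actual configuration of the model discussed here.

```python
def kv_cache_gb(n_layers, n_kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Approximate KV-cache size in GB (1e9 bytes) for a standard decoder."""
    # Factor of 2: one key vector and one value vector per token, per layer.
    total_bytes = 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value
    return total_bytes / 1e9


def fits(ram_gb, weights_gb, kv_gb, reserved_gb=6.0):
    """True if weights + KV cache + a reserve for everything else fit in RAM.

    reserved_gb is a guess at what the OS and other apps need.
    """
    return weights_gb + kv_gb + reserved_gb <= ram_gb


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    shape = dict(n_layers=64, n_kv_heads=8, head_dim=128)
    weights_gb = 20.0  # "about 20 GB" from the comment above

    for context in (8_192, 32_768, 131_072):
        kv = kv_cache_gb(context_tokens=context, **shape)
        for ram in (32, 48):
            verdict = "fits" if fits(ram, weights_gb, kv) else "does not fit"
            print(f"context={context:>7}  kv={kv:6.2f} GB  RAM={ram} GB: {verdict}")
```

Under these assumed numbers the cache grows linearly with context length, which is why a machine that handles short prompts comfortably can run out of memory on long inputs or long reasoning traces.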


3 comments · 1 top-level

manmal · 1y ago · 2 in thread

I've tried the very early Q4 MLX release on an M1 Max with 32GB (LM Studio at default settings), and have run into severe issues. For the coding tasks I gave it, it froze before it was done with reasoning. I guess I should limit the context size. I do love what I'm seeing though: the output reads very similar to R1, and I mostly agree with its conclusions. The Q8 version has to be even better.

whitehexagon · 1y ago

Does the Q8 fit within your 32GB? (I'm also using an M1 with 32GB.)

manmal · 1y ago

No, Q4 just barely fits, and with a longer context sometimes things freeze. I definitely have to close Xcode.
