
0 points · diseasedyak · 4h ago · 0 comments

I have 24 GB of VRAM (an RTX 4090) and run Qwen3.6-35b:iq4. It uses importance-aware quantization, so it isn't nearly as dumb as it sounds, and it fits the 35b model into 18 GB, leaving some VRAM to spare. So far I've had no issues, other than image generation being slow; I found that if you want that done with any alacrity, you should just have a cloud model do it.

For anything else local, including writing some automation scripts and such, it works great.
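
For anyone wondering how 35b ends up in 18 GB, here is a rough back-of-the-envelope sketch. It assumes exactly 35e9 parameters, treats 1 GB as 10^9 bytes, and ignores KV cache and runtime overhead, so the numbers are approximate rather than what any particular loader will report.

```python
def bits_per_weight(model_bytes: float, n_params: float) -> float:
    """Average number of bits stored per parameter."""
    return model_bytes * 8 / n_params

# Figures from the post; the exact parameter count is an assumption.
PARAMS = 35e9      # "35b" taken as 35 billion parameters
MODEL_GB = 18      # size of the quantized model
VRAM_GB = 24       # RTX 4090

bpw = bits_per_weight(MODEL_GB * 1e9, PARAMS)
headroom_gb = VRAM_GB - MODEL_GB      # left for context/KV cache and other buffers
fp16_gb = PARAMS * 2 / 1e9            # 2 bytes per weight, unquantized, for comparison

print(f"{bpw:.2f} bits/weight, {headroom_gb} GB headroom, fp16 would need {fp16_gb:.0f} GB")
```

That works out to a bit over 4 bits per weight on average, which is consistent with a 4-bit quant, and about 6 GB of headroom on a 24 GB card.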


3 comments · 2 top-level

Zambyte · 4h ago · 1 in thread

Can you link the model? I also have a 24 GB card (7900 XTX). I've been having success with the dense 27b model, but I'd like to see if the 35b iq4 is any better.

jboss10 · 4h ago

https://unsloth.ai/docs/models/qwen3.6 and https://huggingface.co/collections/unsloth/qwen36

ai_fry_ur_brain · 4h ago

What's your example of a "great automation script"?
