I have 24 GB of VRAM (an RTX 4090) and run Qwen3.6-35b:iq4. That's an importance-aware quantization, so it isn't nearly as dumb as it sounds, and it fits the 35b into 18 GB, so you have some VRAM left over. So far I've had no issues, other than things like image gen taking a while. What I found out is that if you want that done with any alacrity, just have a cloud model do it.
For anything else local, including writing some automation scripts and such, it works great.
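
If you want to sanity-check the sizes, here's a rough back-of-the-envelope sketch. It assumes "35b" means 35 billion parameters, that GB means 10^9 bytes, and that the 18 GB figure is weights only; the function names are just made up for illustration, and real model files carry some extra overhead.

```python
# Back-of-the-envelope VRAM math for a quantized model.
# Assumptions (not measured): 35e9 parameters, decimal GB (1e9 bytes),
# and the 18 GB figure covering weights only.

def implied_bits_per_weight(size_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter for a given weight size."""
    size_bits = size_gb * 1e9 * 8
    return size_bits / (params_billion * 1e9)

def weights_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB at a given average bit width."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

bpw = implied_bits_per_weight(18, 35)   # roughly 4.1 bits per weight
fp16_gb = weights_size_gb(35, 16)       # 70 GB unquantized at 16 bits
headroom_gb = 24 - 18                   # 6 GB left on a 24 GB card

print(f"implied bits/weight: {bpw:.2f}")
print(f"FP16 size: {fp16_gb:.0f} GB, headroom after quant: {headroom_gb} GB")
```

So 18 GB works out to a bit over 4 bits per weight, versus 70 GB at 16 bits, which is why the quant is what makes this fit on a single 24 GB card at all. The leftover 6 GB is what's available for everything else that needs VRAM, such as context.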