I can't speak for the US, but in Germany (where hardware is usually more expensive, not less), I got my 3090 3 months ago for 750 euro and have been running the iq4_nl 27B using q4 kv (which after recent patches in llama.cpp is in my xp indistinguishably accurate from q8 of f16) at full ctx, with MTP at 2, peaking around 70 t/s on small ctx, around 50 t/s when im around 64k and ends around 40 t/s near the cap. The rest of the PC is a 50 euro ddr3 16gb i5 4th gen box, absolutely nothing special. And this setup is often more useful than dsv4pro (and sometimes kimi, but not glm) for research and ML work.