The problem is that the hardware is still like $3000. Making anything run on Macs is an exercise in futility. And it's a shame that people get duped into buying Macs for LLM inference.
$3000 for running a 397B-total-parameter model is quite a bargain. The Mac is being used here for its fast internal storage, since storage bandwidth is the key bottleneck. You could probably achieve similar results with conventional (even fairly low-end) iGPU/APU hardware plus a fast PCIe 5.0 x4 SSD, which would also let you overlap SSD transfers with iGPU/APU compute, but the cost would land in a similar range. (Unless you carefully chose low-end hardware, e.g. Intel, with proper PCIe 5.0 x4 NVMe support, which is still quite uncommon, especially in laptops.)
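To illustrate the overlap idea, here's a minimal sketch of double-buffered weight streaming. Everything here is a stand-in: the layer filenames, the tiny shard sizes, and the `compute` step are all hypothetical, and a real implementation would stream multi-GB expert/layer shards from NVMe and run the forward pass on the iGPU/APU. The point is just the structure: the next layer's weights are read from SSD in the background while the current layer computes, so transfer time and compute time overlap instead of adding up.

    import threading
    import numpy as np

    # Hypothetical per-layer weight shards; a real model would have
    # hundreds of far larger files streamed from NVMe.
    LAYERS = [f"layer_{i}.bin" for i in range(4)]
    for p in LAYERS:
        np.random.rand(1024).astype(np.float16).tofile(p)

    def load_weights(path):
        # Stand-in for a large sequential SSD read (the bandwidth
        # bottleneck). File I/O releases the GIL, so a background
        # thread genuinely overlaps with compute here.
        return np.fromfile(path, dtype=np.float16)

    def compute(x, w):
        # Stand-in for the iGPU/APU forward pass on one layer.
        return x * w.mean()

    def forward(x):
        w = load_weights(LAYERS[0])
        for i in range(len(LAYERS)):
            prefetched = {}
            t = None
            if i + 1 < len(LAYERS):
                # Kick off the next layer's SSD read before computing,
                # so the transfer runs concurrently with compute.
                t = threading.Thread(
                    target=lambda p: prefetched.update(w=load_weights(p)),
                    args=(LAYERS[i + 1],),
                )
                t.start()
            x = compute(x, w)  # overlaps with the background read
            if t is not None:
                t.join()
                w = prefetched["w"]
        return x

    print(forward(np.ones(1024, dtype=np.float16)))

If the per-layer compute time is close to the per-layer transfer time, the SSD read is nearly free; whichever of the two is slower sets the effective per-layer latency.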