Better HN

0 points · GaryBluto · 2mo ago · 0 comments

Luckily local AI is becoming more feasible every day.

8 comments · 8 top-level

Someone1234 · 2mo ago

It feels more and more like OpenAI/Anthropic aren't the future but Qwen, Kimi, or Deepseek are. You can run them locally, but that isn't really the point; it is about democratization of service providers. You can run any of them on a dozen providers with different trade-offs/offerings OR locally.

They won't ever be SOTA because of money, but "last year's SOTA," at 1/4 of the cost or less, may be good enough. More quantity and more flexibility, at lower peak quality. It can make sense: a TEAM of agents that are 7% dumber vs. a single objectively superior super-agent.

That's the most exciting thing going on in that space. New workflows opening up not due to intelligence improvements but cost improvements for "good enough" intelligence.

fourside · 2mo ago

Maybe for folks who are deep into this, but it’s not exactly accessible. I tried reading up on it a couple of months ago, but between working out what hardware I needed, which model to pick and how to configure it (model size vs. quantization), and how I’d get access to the hardware (for decent results in coding, new hardware ran $4k-$10k last I checked), there was a nontrivial barrier to entry. I was trying to do this over a long weekend and ran out of time. I’ll have to look into it again because having the local option would be great.

Edit: the replies to my comment are great examples of what I’m talking about when I say it’s hard to determine what hardware I’d need :).
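To make the "model size vs quantization" trade-off a bit more concrete, here is a rough back-of-the-envelope sketch. It only estimates the memory needed to hold the weights: parameter count times bits per weight. The function name `weight_memory_gb` and the 20% `overhead` factor are assumptions made for illustration, not figures from any particular runtime, and real quantized formats often use slightly more than their nominal bit width.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float,
                     overhead: float = 1.2) -> float:
    """Rough memory (in GB, 10^9 bytes) needed to hold a model's weights.

    overhead is an assumed fudge factor for runtime buffers; it is a guess,
    not a measured value.
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only to show how the numbers scale.
    for params in (7, 32, 70):
        for bits in (16, 8, 4):
            gb = weight_memory_gb(params, bits)
            print(f"{params}B params at {bits}-bit: ~{gb:.1f} GB")
```

Under these assumptions, halving the bits per weight halves the memory, which is why the same model can fit on very different hardware depending on the quantization chosen. The estimate ignores context (KV cache) memory, so treat it as a lower bound.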

nozzlegear · 2mo ago

I've been using local AI via LM Studio ever since I canceled my Claude subscription. It's obviously slower than Claude on my M1 Studio[†], but like someone else said, I use AI more like a copilot than an autopilot. I'm pretty enthused that I can give it a small task and let it churn through it for a few minutes, while I work on something alongside – all for free with no goddamned arbitrary limits.

[†] The latest Qwen 3.6 whatever has been a noticeable improvement, and I'm not even at the point where I tweak settings like sampling, temperature, etc. No idea what that stuff does, I just use the staff picks in LM Studio and customize the system prompts.

ModernMech · 2mo ago

I love how it's just a tacit understanding that these companies' entire MO is to carve out a territory, get everyone hooked on the good stuff and then jack up the price when they're addicted and captured -- literally the business plan of crack dealers, and it's just business as usual in the tech industry.

politelemon · 2mo ago

Feasibility on commodity hardware would be the true benchmark. Decent results currently require high-end machines, but if we can run inference on the CPUs, NPUs, and GPUs of everyday hardware, the moat should disappear.

aleqs · 2mo ago

Indeed, I feel like we are in AI's equivalent of the early computer era, where giant expensive hardware is still required for frontier models. In 5 years I bet there will be fully open models we'll be able to run on a few thousand dollars of consumer hardware with performance equivalent to Opus 4.7/4.6.

andyfilms1 · 2mo ago

Sure, but local models are still black boxes. They can be influenced by training data selection, poisoning, hidden system prompts, etc. That recent WordPress supply chain hack goes to show that the rug can still be pulled even if the software is FOSS.

root_axis · 2mo ago

Not really. The hardware requirements remain indefinitely out of reach.

Yes, it's possible to run tiny quantized models, but you're working with extremely small context windows and tons of hallucinations. It's fun to play with them, but they're not at all practical.
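For a sense of why context windows get squeezed on small machines, here is a rough sketch of how KV cache memory grows with context length in a standard transformer: one key and one value vector per layer, per KV head, per token. The function name `kv_cache_gb` and the example configuration (32 layers, 8 KV heads, head dimension 128, 16-bit cache values) are hypothetical, chosen for illustration rather than taken from any specific model.

```python
def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                context_len: int, bytes_per_value: int = 2) -> float:
    """Approximate KV cache size in GB (10^9 bytes) for a single sequence.

    The leading factor of 2 accounts for storing both keys and values.
    """
    total_bytes = (2 * n_layers * n_kv_heads * head_dim
                   * context_len * bytes_per_value)
    return total_bytes / 1e9


if __name__ == "__main__":
    # Hypothetical model configuration, for illustration only.
    for context_len in (4_096, 32_768, 131_072):
        gb = kv_cache_gb(n_layers=32, n_kv_heads=8, head_dim=128,
                         context_len=context_len)
        print(f"context {context_len:>7}: ~{gb:.2f} GB of KV cache")
```

The cache grows linearly with context length and sits on top of the memory for the weights, so on a memory-constrained machine a longer context competes directly with a larger or less aggressively quantized model.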