
0 points · pmontra 2d ago · 0 comments

Agreed, but I want to see how it plays out. Historically a good Windows computer cost $1000 and it was all it took to start programming. How much does a computer cost if it needs enough resources to run a good enough AI model for agentic workflows with a reasonable time to first token? Can "most of the world" afford to buy one?


39 comments · 18 top-level

wizee 2d ago · 8 in thread

Qwen 3.6 27B is quite good for agentic coding, and practical to run on consumer hardware. You need either a GPU with 32+ GB of VRAM, or a unified-memory system with 48+ GB of memory and a decent integrated GPU. While not cheap, such a setup is still attainable for much of the world, and will get cheaper over time. Open models hosted on non-American clouds also remain an option with a much lower barrier to entry, for cases where privacy is less critical.
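As a rough sanity check on those memory figures, here is a minimal sketch of the arithmetic for the weights alone. It assumes every parameter is stored at the same bit width and ignores the KV cache, activations and runtime overhead, so real usage will be higher. The function name is made up for illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes).

    Assumption: all parameters use the same bit width. Real quantization
    formats mix widths and add metadata, so treat this as a lower bound.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # A 27B-parameter model at common precisions.
    for bits in (16, 8, 4):
        print(f"27B @ {bits}-bit: ~{weight_memory_gb(27, bits):.1f} GB")
```

On this estimate a 27B model needs about 54 GB at 16-bit and about 13.5 GB at 4-bit before any context is allocated, which is why quantization decides whether it fits on a 24 GB or 32 GB card.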

jochem9 2d ago

There was an article on HN a few weeks ago where someone detailed how they got an old datacenter GPU to run in their consumer PC, with decent performance on Qwen. They spent something like $200 on the GPU (second-hand, of course).

So yeah, I think models on local hardware will be quite common soon among the tech-savvy (such as people creating software).

wrs 2d ago

Especially considering the millions of 2026-class data center GPUs that massively overinvested companies are currently buying, which will be obsolete in a few years.


schmuhblaster 2d ago

Indeed, and with some tinkering around the harness it can even punch way above its weight.

yowlingcat 2d ago

I've seen folks make it work with a 3090 on a 4-bit quant, using turboquant for the KV cache. That's key because 3090s remain the most cost-effective GPU metal for enthusiasts (albeit at 24 GB), and the jump to a 5090 (32 GB) is quite expensive and not always worth it for the LLM-specific performance. Sadly, good 32 GB metal is somewhat lacking at or above the 3090's price point.

thewebguyd 2d ago

> You need a system with either 32+ GB VRAM

I do hope you're right that it will get cheaper over time (it should), but right now 32GB of VRAM is not affordable for a lot of people. You're talking ~$4500 just for the GPU, or $800-ish used if you can find one.

daan-k 2d ago

For inference you can split the 32GB between two 16GB cards. Two new 5060 Tis for ~€1000 in total are more than fine.

It's a tad less efficient and a bit more of a hassle, but still a good experience for only a fraction of the price.


FloatArtifact 1d ago

Intel Arc Pro B70 32GB for $999ish https://www.newegg.com/intel-arc-pro-b70-32gb-graphics-card/...

gleenn 2d ago

A Mac laptop can be had with 32GB of RAM for far less than $4500. Not sure if you actually need 32GB of discrete GPU RAM. My Mac laptop does run Qwen at a reasonable speed.

Chu4eeno 2d ago · 3 in thread

Open weights/source doesn't necessarily mean running on local hardware, though.

I imagine having multiple providers competing will drive down the price of hosted versions of open-weight models drastically.

mncharity 1d ago

> Open weights/source doesn't necessarily mean running on local hardware, though.

And we've barely started to scratch the surface on helping open-weight models "be the best they can be", with cloud-burst parallel sampling and prompt mutation: looking for the best probabilistic results for a prompt, and looking for the best prompt variants for a task. Adaptively scaling compute at generation time, not just at training.

And speculatively, if agentic coding is naturally a multiplicity, what UX might enable human devs to dance with that quantum superposition? Rather than quickly collapsing to one monkey and its keyboard.

Tuna-Fish 2d ago

You are describing OpenRouter. And yes, it does.

Chu4eeno 2d ago

Well, OpenRouter is a bit more general, since it just proxies requests to both proprietary and open-weight models.

I was thinking more of the providers of inference on open-weight models that OpenRouter proxies to.

giancarlostoro 2d ago · 2 in thread

Before the AI "crisis" it used to take about $3500 to get a prebuilt with a 5090 which can run good enough LLMs. I run reasonable LLMs on just 16GB of VRAM on my Mac, and the 5090 has double that.

dgellow 2d ago

What model (and quant) do you run with 16GB? I assume you have a 24GB machine, and dedicate 16GB to the model?

giancarlostoro 1d ago

Quite the opposite: you want a model smaller than your max VRAM, and you use the rest of your VRAM for the context window, so it can run longer.

https://huggingface.co/Jackrong/Qwen3.5-27B-Claude-4.6-Opus-...
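To illustrate the trade-off between model size and context length, here is a hedged sketch of how much context fits in whatever VRAM is left after the weights. It assumes a plain transformer with grouped-query attention and an unquantized 16-bit KV cache. The layer count, KV head count and head dimension below are hypothetical placeholders, not the real configuration of the linked model, and `overhead_gb` is a guessed allowance for runtime buffers.

```python
def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                       bytes_per_value: int = 2) -> int:
    """Bytes of KV cache per token: one key and one value vector per layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def max_context_tokens(vram_gb: float, weights_gb: float, layers: int,
                       kv_heads: int, head_dim: int,
                       bytes_per_value: int = 2,
                       overhead_gb: float = 1.0) -> int:
    """Tokens of context that fit in the VRAM left over after the weights."""
    free_gb = vram_gb - weights_gb - overhead_gb
    if free_gb <= 0:
        return 0
    per_token = kv_bytes_per_token(layers, kv_heads, head_dim, bytes_per_value)
    return int(free_gb * 1e9 // per_token)


if __name__ == "__main__":
    # Hypothetical architecture: 64 layers, 8 KV heads, head_dim 128.
    for vram in (16, 24, 32):
        tokens = max_context_tokens(vram, weights_gb=13.5, layers=64,
                                    kv_heads=8, head_dim=128)
        print(f"{vram} GB VRAM -> roughly {tokens} tokens of context")
```

With these placeholder numbers each token costs about 0.26 MB of cache, so every spare gigabyte buys a few thousand tokens. Quantizing the KV cache, as mentioned elsewhere in the thread, shrinks `bytes_per_value` and stretches the same VRAM further.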


abetusk 2d ago · 2 in thread

Moore's law or one of its generalizations still holds, so it should only be a short time before a $1k computer can train and run a powerful enough model.

Windchaser 2d ago

I thought Moore's Law came to an end in the last decade?

Certainly the transistors/chip or transistors/$ or flops/$ have not been progressing at the same exponential rate as during 1970-2010. There is still progress, but it's rather slower.

abetusk 2d ago

I was careful to say "Moore's law-like". Moore's law is stated as "transistor count per area doubles about every 2 years." [0] Taken literally, then yes, it may well have ended, but while transistor density is important, it's not really the quantity we care about.

As you point out it's really cost per transistor or cost per flop that we mostly care about. I'm finding it hard to find a succinct and clear plot, but I believe one is provided by Our World In Data on "GPU computational performance per dollar" [1] which, to my eyes, clearly shows exponential growth in computational power per dollar.

The picture for storage is a little more muddied but if you squint just right you might still be able to recover an uninterrupted exponential growth [2].

In my view, it's pretty clear that advances in AI have progressed so quickly because GPUs have been keeping up with the exponential growth of computational power (per unit cost).

Exponential growth in this area is usually characterized by "S-curves", where one technology gets saturated but the exponential increase in power or decrease in cost is picked up by another, adjacent, technology, that allows the growth to continue. For compute it's CPUs to GPUs. For storage it's platter drives that are now being overtaken by SSDs.

The more general phenomenon is called Wright's law, or experience curve effects [3].

[0] https://en.wikipedia.org/wiki/Moore%27s_law

[1] https://ourworldindata.org/grapher/gpu-price-performance?ySc...

[2] https://ourworldindata.org/grapher/historical-cost-of-comput...

[3] https://en.wikipedia.org/wiki/Experience_curve_effect
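For a feel of what exponential growth in compute per dollar implies for the "$1k computer" question, here is a small sketch. The doubling period is a free parameter and an assumption. It is not a value read off the linked charts, and the starting price is only an example.

```python
import math


def years_until_affordable(current_cost: float, target_cost: float,
                           doubling_years: float) -> float:
    """Years until hardware costing `current_cost` today costs `target_cost`.

    Assumes performance per dollar doubles every `doubling_years`, so the
    price of a fixed amount of compute halves on the same schedule.
    """
    if current_cost <= target_cost:
        return 0.0
    return doubling_years * math.log2(current_cost / target_cost)


if __name__ == "__main__":
    # Example: a $4000 rig today, a $1000 target, several assumed doubling times.
    for doubling in (2.0, 2.5, 4.0):
        years = years_until_affordable(4000, 1000, doubling)
        print(f"doubling every {doubling} years -> about {years:.1f} years")
```

The answer is very sensitive to the assumed doubling time, which is more or less what the disagreement above is about.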

mbgerring 2d ago · 2 in thread

About $2k in 2026 dollars and falling.

simonw 2d ago

... or rising, at least as long as there's a RAM shortage.

mbgerring 2d ago

I’d bet that there won’t be a RAM shortage for very long.


majormajor 2d ago · 1 in thread

> Historically a good Windows computer cost $1000 and it was all it took to start programming.

Gotta remember inflation here.

$1K in 1995 was roughly equivalent to $2K now and wouldn't have been a particularly "good" machine then.

In 1982 the Commodore 64 started at about $600, also roughly $2K today.

If you outgrew that, beefier machines back then were A LOT. It was easy to find $2k+ towers and (especially) laptops even into the 2000s, and a lot of those would be $5K+ equivalent today.
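The inflation arithmetic above can be reproduced with a few lines. The index values below are approximate US CPI-U annual averages and should be checked against the official BLS series before being relied on. 2024 is used as a stand-in for "today", which is an assumption.

```python
# Approximate US CPI-U annual averages (1982-84 = 100). Assumed values,
# rounded; verify against the BLS series before using them for anything real.
CPI = {1982: 96.5, 1995: 152.4, 2024: 313.7}


def adjust_for_inflation(amount: float, from_year: int,
                         to_year: int = 2024) -> float:
    """Convert a price in `from_year` dollars to `to_year` dollars."""
    return amount * CPI[to_year] / CPI[from_year]


if __name__ == "__main__":
    print(f"$1000 in 1995 is about ${adjust_for_inflation(1000, 1995):.0f} in 2024")
    print(f"$600 in 1982 is about ${adjust_for_inflation(600, 1982):.0f} in 2024")
```

Both examples land close to $2K in 2024 dollars, consistent with the rough figures quoted in the comment.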

SoftTalker 2d ago

And a Unix workstation in those days could be high 4 or even 5 figures, depending on configuration.

ssivark 2d ago · 1 in thread

I don't understand the justification for local hardware with cost as the motivation. The same (or bigger/better) open-weight models can be served by third parties at much higher resource utilisation, and will therefore be much cheaper!?

Especially because the world is likely to persist, at least for a while, in a state where computing hardware demand drastically exceeds supply, resulting in high prices for hardware. So why wouldn't you want to max out utilisation and amortize costs, at least for typical (non-sensitive) use cases?
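The utilisation argument can be made concrete with a rough amortization sketch. All numbers here are hypothetical: the hardware price, lifetime and utilisation levels are illustrative inputs, and the model ignores resale value, batching gains and provider margins.

```python
def cost_per_busy_hour(hardware_cost: float, lifetime_years: float,
                       utilization: float,
                       power_cost_per_hour: float = 0.0) -> float:
    """Amortized hardware cost per hour of actual use.

    `utilization` is the fraction of wall-clock time the hardware is busy.
    Power is charged only for busy hours, which is a simplification.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    busy_hours = lifetime_years * 365 * 24 * utilization
    return hardware_cost / busy_hours + power_cost_per_hour


if __name__ == "__main__":
    # Hypothetical $4000 rig written off over 3 years.
    personal = cost_per_busy_hour(4000, 3, utilization=0.05)  # a few hours a day
    shared = cost_per_busy_hour(4000, 3, utilization=0.60)    # a busy provider
    print(f"personal rig:   ${personal:.2f} per busy hour")
    print(f"shared hosting: ${shared:.2f} per busy hour")
```

Under these assumptions the same hardware is about twelve times cheaper per useful hour when it is kept busy, which is the core of the case for hosted open-weight models when privacy is not the constraint.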

layer8 2d ago

IMO the more useful distinction is in analogy to VPS versus SaaS/PaaS. Open models allow you to use any inference provider you like, including local ones, similar to running open-source software using VPS providers. You’re not bound to a particular SaaS/PaaS as you are with closed model providers. That same freedom also allows you to self-host when you care about that.

bensyverson 2d ago · 1 in thread

Yes, between Moore's Law and more efficient model architectures, we just have to let time do its work.

Danox 2d ago

Software models and hardware are getting better all the time—and that’s where some big companies spending billions might stumble! In fact, Microsoft recently announced that they’re scaling back a bit on their AI investments.

skydhash 2d ago · 1 in thread

> Historically a good Windows computer cost $1000 and it was all it took to start programming

Started with computers around 2009 and later bought an oldish computer (a Pentium 4 PC) for the equivalent of 50 USD. Code::Blocks and Python IDLE were free at the time (C and Python were the first languages I learned). The barrier to programming has always been low, as the only things you needed were books (the internet made things easier) and access to a PC (I had friends with laptops and my school lab).

Supermancho 2d ago

In 1998 I had a second PC for the first time in my life. It was an old i486DX2-66 that my dad gave to me. He said it cost him $50 secondhand and was bought so I could put Linux on it.

onel 1d ago

Even if you go with an open source AI model, it will still make a lot more financial sense to pay a provider than to actually invest in the hardware to self-host it.

It's definitely worth investing in self-hosting the agent infrastructure around the model though: all the documents, the knowledge base, all the connectors, and the agent itself running on your hardware.

orbital-decay 1d ago

How far back are we talking? Historically there were just a few computers in the world; the personal computing revolution happened much later. And a private inference rig costs roughly the same in 2026 as a personal computer did in the early '80s, if you want performance comparable to a personal workstation of that time (vs a mainframe/supercomputer).

crazycracker 2d ago

Historically the cost of compute has also gone down. Just compare it to a year ago. We have amazing open source models that can run on consumer hardware, and if we get away from our obsession with using Opus 4.8 or Mythos for everything, it's actually amazing to see what these open source models can do. I use qwen3.6:27b as a daily driver and I am heavily impressed with it.

nicoburns 2d ago

I think AI is still in its "mainframe era". Once upon a time, computers that ran useful workloads filled a whole room. And now that's true of computers that run (state of the art) AI models. My guess would be that this won't always be the case, although I certainly don't claim to have a clear idea of how things will play out.

Kim_Bruning 2d ago

Roughly EUR 3-4K right this minute, I think? The graphics card, RAM and storage prices are punishing. Under more normal circumstances (hopefully late 2027) it'd be 1500-2500, depending on what you think is realistically useful.

Possibly it's the same price range, allowing for inflation.

Thraway198 1d ago

With the free tiers available now you can make your own document processors and other such office software, in one pass.

rayiner 2d ago

Isn’t this just a bet that I’ll have an AI data center in my iPhone within 10 years? Why is that a bad bet?

epolanski 2d ago

If we had 256-512 GB of unified memory at 2022 RAM prices, we'd be talking about computers costing around 1500.

ktallett 2d ago

Hence why brute force needs to be replaced with approaches such as neuromorphic methods. These could realistically be combined with mesh networking as well, to utilise the capabilities of all the computers available locally.
