0 points · jerojero · 3d ago

Open weight models from Chinese labs tend to be significantly cheaper.

I think they're absolutely needed. I can't afford 200 USD a month for personal use of coding AI, and I don't think such prices are reasonable for most of the world economy anyway. Not to mention US firms might be giving their employees a lot more than that.

It's increasingly feeling, to me, that there's a gap building up between haves and have nots. But then we get news of these open weight models that are reasonably priced for inference, with reasonable capabilities. Yes, they take maybe 6-9 months to get there, but tbh that's not a bad trade-off at all.

63 comments · 13 top-level

ttoinou 3d ago · 13 in thread

200 is much less than the value you’re supposed to get out of it. If it’s not, then yeah, go ahead and use cheaper models with worse quality.

martinjc 2d ago

Are you aware of how much purchasing power 200 dollars is in China, Brazil, Thailand or India? This is an extremely arrogant take.

nwienert 1d ago

I’ve hired many Asian developers at anywhere from 1-4k a month.

I get a lot more out of a 200/mo subscription now in a week than I did from them in a month.

Now obviously in today’s world they’d be using a 200/mo subscription themselves. But it’s not like the money is nothing; software development doesn’t scale down below 1k/mo for anyone competent, even in the poorest areas.

1 more reply

dash2 1d ago

Parent’s point was that many, many people will get much more than $200 of value from the “expensive” model. Sure, a Bihar farmer won’t, but even an Indian software developer may easily do so if he or she has Western clients.

mrngld 1d ago

What's that got to do with the cost of a thing? Are tradesmen in Thailand entitled to Makita tools just because American plumbers can afford them? I'm struggling to understand the entitlement in some of the comments. And even though it doesn't matter, I'd point out it's not like OpenAI or Anthropic are making enormous profits at the moment.

matheusmoreira 1d ago

For the record, 200 USD is around 60% of the Brazilian minimum wage.

1 more reply

Dayshine 3d ago

I'm not sure how I'm supposed to get $200 of value out of personal use!

LPisGood 2d ago

Note that 200 dollars of value is different than 200 dollars of profit.

devmor 2d ago

I personally don’t find it that useful for most tasks, but if, say, you get paid $50/hr for your work and it saves you more than 4 hours of work in a month, there you go.
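
A quick check of that break-even arithmetic, as a minimal sketch. The function name is made up for illustration; the $200 subscription and $50/hr rate are the figures from the comment above, and the $20 plan is the one mentioned elsewhere in the thread.

```python
def break_even_hours(subscription_usd: float, hourly_rate_usd: float) -> float:
    """Hours of work a subscription must save per month to pay for itself."""
    if hourly_rate_usd <= 0:
        raise ValueError("hourly rate must be positive")
    return subscription_usd / hourly_rate_usd


# $200/month plan at $50/hr: it has to save 4 hours a month to break even.
print(break_even_hours(200, 50))  # 4.0

# The same rate against a $20/month plan.
print(break_even_hours(20, 50))  # 0.4
```

This only covers people who bill for their time; for personal use with no hourly rate there is no such break-even point.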

2 more replies

holoduke 2d ago

Here most of my colleagues have 200+ dollar rates. It's really a no-brainer. But sure, in South America or some Asian countries maybe it isn't. Still, most devs need it anyway, also in the poorer regions.

3 more replies

uberex 2d ago

Unless that value is $200 cash in hand it will be hard to afford it for people who just don't have $200.

margalabargala 1d ago

Last time you bought a computer, did you buy the absolute fastest best CPU available?

girvo 1d ago

Yes, but that was because I could see the writing on the wall with respect to hardware prices being cooked by AI demand, so I built the best computer possible at the time, knowing it'd probably need to last me the next 5+ years.

So not really comparable. I use Step 3.7 Flash locally; models are good enough for so many coding tasks even at the lower end! (Though I note that calling a 200B model "lower end" is kind of amusing.)

smrtinsert 1d ago

I've actually come to believe the overwhelming majority of use cases require nowhere near frontier quality, so there's that. Much faster execution is just a bonus on top of the much reduced cost.

fbrncci 1d ago · 8 in thread

You made me realize something. I routinely spend upwards of 500$ per month on LLMs for coding (expensed to clients). However, I live in a place where 500$ is around the avg. salary. I’m lucky that I know my way around western clients: clients who pay these expenses and are happy to work with me because I am still about 50% cheaper than local talent in the EU/US, while my salary at home converts to an upper-class income in the highest tax bracket.

Which of course causes some unfairness on both ends. Nobody here can compete with me. I often use leftover tokens on local client projects, which, despite the lower pay, still pay off because they now take hours, not days or weeks, to complete. And nobody in the local client talent pool can compete with me unless they charge about half the market rate.

Take away my 500$ monthly grant and I’d be more or less screwed. Better open models will more or less start to reduce this advantage. It’s not like I positioned myself here on purpose. But it’s definitely a “right place, right time” situation.

whazor 1d ago

The problem is that the differences between flagship and local models compound heavily. A 4% difference could be massive when you keep iterating on the same code base.

swiftcoder 1d ago

> The problem is that the differences between flagship and local models are compounding heavily

This depends a lot on how you work, and how much of the architectural thinking you do yourself.

People seem to lose sight of the fact that a flash model today is as powerful as a frontier model from a year ago. If you were happy with GPT 4.x, you should be ecstatic that equivalent power is now basically free...

2 more replies

listic 1d ago

Thanks for sharing your insight.

Mind if I ask you for a few vibe coding tips? I failed to solve the gh puzzle in your profile, though.

swader999 1d ago

If you are running multiple agents, what they cost you should be several times less than their ROI.

fbrncci 1d ago

My costs are 0$ as any token or subscription spend on agents is invoiced as an expense to my clients.

1 more reply

lanthissa 1d ago

AI is the first technology that doesn't incentivize offshoring, and incentivizes co-location of talent.

A NYC dev and a dev in India have the same AI costs; based on the tokens/salary ratio, it becomes less of a comparative disadvantage to be in NYC.

Now combine that with the fact that AI makes the act of generating code a smaller % of the job's time, and the ability to get/refine requirements more of the job, and you have a decent shift.

fbrncci 22h ago

The tokens/salary ratio is not relevant at all, because while 200-500$ is a lot of money, it’s still a fraction of the salary you’d pay any dev in the world. It just comes out as a tooling expense. It also matters how those devs use the tools; you can’t assume everyone gets the same out of them. That amount can last a day or it can last a month. I would say a dev in a developing nation would be more budget-aware than someone used to everything being priced at NYC rates.

For example, I build other AI products and I have been hyper-aware of the token spend of our users. I was going crazy seeing that some users were having 5$ conversations. So that was optimized, and I found ways to use sub-agents to get it down to 1-2$. Only for management to ask me why I was worrying to begin with. The users are consultants being paid 120$ per hour. They have a daily 10-20$ token expense, no problem. “But amazing job on the cost reduction.” Well, 5$ for me is what I spend on food daily, while the consultant is slamming “yes” 10 times in a chat, for whatever reason, for the same cost. Would the NYC dev care as much by default? No.

You can still hire three devs in India for the price of a dev in NYC. Now you give them AI and you might only need 1-2. That makes offshoring even more appealing, not less. And the dev in India now has the tooling to outcompete local talent. Well, that’s my reality (I am not in India, though).

Sammi 1d ago

Errr you just responded to someone that is offshore and is using AI to be much cheaper than local talent.

arikrahman 1d ago · 8 in thread

Someone else on this forum put it well: the U.S. is trying to achieve AGI at all costs, while Chinese models are seeking widespread adoption.

rglullis 1d ago

> U.S. is trying to achieve AGI at all costs

If that was true, they would be collaborating with each other and opening up all the results from their work.

HappMacDonald 16h ago

.. "NOBUS" AGI with a moat at all costs..

azinman2 1d ago

I don't think anthropic/openai/google aren't also seeing widespread adoption. In fact, they already have the market share.

Turskarama 1d ago

The difference is that the US companies are using it as a means to an end: they need to make just enough profit that the investors don't all get cold feet before they get to AGI. The Chinese companies, on the other hand, are trying to be profitable immediately, which means that they're going slower to save development costs.

tsss 1d ago

Everyone wants widespread adoption, of course. I'm sure that China is also working on more expensive frontier intelligence models behind closed doors, but they're lagging behind America on that front. Going for cost-optimized open weight models is their bet to stay relevant in a market where they can't compete for the "luxury" segment. It is important for them to get a foot in the door and maintain a presence in the press to attract future customers, given the general animosity towards China in the West that they need to overcome. Similarly, European providers like Mistral are hopelessly outclassed in every respect and thus try to carve out a niche in the market with regulation and anti-American fearmongering. They position themselves as "privacy-conscious" not out of goodwill but because it is their only chance to survive as a company with an utterly inferior product.

arikrahman 1d ago

I didn't think of the European angle; that's something I can update in my synthesis.

lionkor 1d ago

None of the AI companies in the US are on the path to AGI. They are, however, on the path to claiming they have AGI, then subsequently not releasing it and only giving it to the US government to make drones that can bomb the homes of political dissidents.

dotancohen 1d ago

What kind of off-topic political ideology spam is this? Do you not think that the Chinese kill their enemies?

The Chinese are genociding Uyghurs as we speak, purely for being Muslim, in numbers that dwarf any harm the US has done.

2 more replies

tacomagick 2d ago · 5 in thread

DeepSeek through their own API has saved me tons of tokens, honestly. Even though it is not as smart as Kimi or Claude, their barrier to entry is very low, with a 2$ top-up and pay-as-you-go, compared to Claude's subscription or Kimi's 20$ top-up.

praveer13 2d ago

For personal use I’m considering using the frontier models from OpenAI or Anthropic to create a plan (research, brainstorming, etc.) with enough detail for cheap models (GLM, DeepSeek, etc.) to follow via OpenRouter. I will monitor how cheap and effective that turns out to be.

ImaCake 1d ago

You should try out the cheaper models first. I find Deepseek v4 models pretty comparable to sonnet 4.6 but at a fraction of the cost. You might find you just don't need to use the American models at all.

lionkor 1d ago

Seconding the recommendation to use Deepseek directly via the API. I've burnt 287 million tokens in the last couple of days, costing me a whopping $5.77 USD.

tacomagick 1d ago

In my case OpenRouter breaks DeepSeek caching and charges me multiple times over what I pay for DeepSeek's API. With 2$ I was easily able to get around 120M tokens from DeepSeek directly, when OpenRouter could only barely do 250k.
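
To put the figures quoted in this sub-thread on one scale, here is a rough sketch that converts them to dollars per million tokens. The helper name is made up for illustration, the inputs are the commenters' own numbers taken at face value, and the results are blended totals rather than official list prices.

```python
def usd_per_million_tokens(spend_usd: float, tokens: int) -> float:
    """Blended cost in USD per one million tokens."""
    if tokens <= 0:
        raise ValueError("token count must be positive")
    return spend_usd / (tokens / 1_000_000)


# lionkor: 287 million tokens for $5.77 via the DeepSeek API.
lionkor = usd_per_million_tokens(5.77, 287_000_000)

# tacomagick: $2 bought ~120M tokens directly, but only ~250k through a router.
direct = usd_per_million_tokens(2.00, 120_000_000)
routed = usd_per_million_tokens(2.00, 250_000)

print(f"{lionkor:.3f}")        # 0.020
print(f"{direct:.3f}")         # 0.017
print(f"{routed:.3f}")         # 8.000
print(round(routed / direct))  # 480
```

On these numbers the two direct-API reports agree at roughly two cents per million tokens, while the routed figure comes out about 480 times higher, which would fit the claim that cache hits were not being applied.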

1 more reply

mdjxnxnxnd 1d ago

I call this the reviewer/implementer pattern: Opus for planning, then ds4/qwen/kimi for implementation, then Opus for PR review.
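
A minimal sketch of how that plan, implement, review hand-off could be wired up. Everything here is hypothetical: `Model` stands in for whatever client you actually call, the prompts are placeholders, and the stub models exist only so the example runs without API access.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interface: a model is any function from prompt text to reply text.
Model = Callable[[str], str]


@dataclass
class Result:
    plan: str
    patch: str
    review: str


def plan_implement_review(task: str, frontier: Model, cheap: Model) -> Result:
    """Expensive model plans and reviews; cheap model writes the code."""
    plan = frontier(f"Write a step-by-step implementation plan for: {task}")
    patch = cheap(f"Implement this plan exactly:\n{plan}")
    review = frontier(f"Review this patch against the plan.\n{plan}\n{patch}")
    return Result(plan=plan, patch=patch, review=review)


# Stub models that just tag the first line of the prompt they received.
def fake_frontier(prompt: str) -> str:
    return f"[frontier] {prompt.splitlines()[0]}"


def fake_cheap(prompt: str) -> str:
    return f"[cheap] {prompt.splitlines()[0]}"


result = plan_implement_review("add a retry wrapper", fake_frontier, fake_cheap)
print(result.plan)
print(result.patch)
print(result.review)
```

The point of the split is that the expensive model only sees two short prompts per task, while the token-heavy implementation step runs on the cheap one.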

brian-armstrong 1d ago · 5 in thread

I read these stories and I can never figure out how people are managing to use these $200 plans. If I really go full bore, I can sometimes max out the $20 plan. Even then, it already produces more code than I can reasonably review and merge.

ipaddr 1d ago

I maxed out my ChatGPT Plus in the first week, and that included an SMF forum rewrite. Trying my best, I haven't been able to max it out again. Things are set up so that you need to max out your 5-hour window multiple times, which becomes a job in itself.

At work I'm struggling to keep my claude bill around $500.

girvo 1d ago

Simple: a lot of the people claiming they’re reviewing the output of these models are lying.

Also if you run the “loops” they’re now yapping about, it will burn through enormous amounts of usage as well.

theoli 1d ago

Exactly this, it’s the loops. The first 50k tokens of a task are by far the most valuable. But when left to run independently, the agent will consume millions of tokens of error messages from running tests and discovering a minor syntax error, a missing import, a method call with incorrect parameters, etc. Then it will write some helper program while debugging the main task and get into the same loop debugging minor errors in the helper. From my experience, the vast majority of tokens consumed by Claude Code on totally independent tasks are consumed fixing minor mistakes it just made.

hgomersall 1d ago

I can't even keep up with the chain of thought needed to manage a single session, let alone review. I typically never exceed 30% of a 5x plan. Fable took me almost to the limits, but not Opus. Claude design hits things harder, but still not to saturation.

RugnirViking 1d ago

Do you do it for a job (8 hours a day)? And do you work in large, mature projects (more than 5 team members)? A big part of it is dealing with frankly terrible architecture and 15 people's different ideas of how things should work (and the spam they've been able to produce with their own agents makes this worse).

Fr0styMatt88 1d ago · 4 in thread

If we can agree that the AI model is at least as capable as a junior engineer or new contractor, how’s that different to saying “software engineering isn’t worth $200 a month”?

Has a very race-to-the-bottom feel to it.

Though in the grand scheme of it, $200/mo probably isn’t the real price either. Also looking at it not just in a vacuum - paying for a product that can change what you get from under you doesn’t seem great anyway.

At least with a locally-hosted model you know what you’re getting.

matheusmoreira 1d ago

Yeah. There's no way to verify what these providers are doing. The real future is running these models at home. Opus-level inference on our own hardware would be a dream come true.

baq 1d ago

I've been dreaming of having an LLM in a box over USB, bought off AliExpress, for a year and change now.

The LLM in a box is something you can buy today, but 1. it doesn’t serve over USB by default, 2. it costs $100k for hardware (not counting electricity) at 100 tps, and 3. you can’t buy it from AliExpress.

Better to put that $100k in t-bills and just buy tokens even at api prices.

1 more reply

IncreasePosts 1d ago

How will anyone running home instances be able to compete against people paying some money running much more powerful models on much more powerful hardware?

3 more replies

RazorBucksICO 1d ago

The appropriate price is what the output is worth to you. Some people could pay $10,000/month, some $5 and feel like they were breaking even. There is a big jump between convenience and curiosity uses versus business critical.

OpenAI already charges enterprise users a premium purely for that title over on-demand, no-contract usage. Retail users get a good deal. People make a lot of hay about subsidies but this is a very sane approach if you want exposure to these three different types of customers.

giancarlostoro 1d ago · 2 in thread

As much as I don't like Mark Zuckerberg, part of me wishes he would get his head in the game and compete with these models. He's literally got all the capability to do so, and he could easily sell the model through deals with GCP, AWS, and Azure. Hell, I feel like Amazon needs a hot model they can host that's exclusive to them; maybe he can work something out with them. Whatever the case, it seems glaringly obvious to me. I'm not sure why he hasn't taken a stab at competing with Claude Code, or at least with frontier open models, and then cutting a deal with cloud providers to recoup the costs of maintaining said models.

He's sitting on a frontier model, letting it burn a hole in his wallet, when it could actually pay for itself.

khurs 1d ago

Meta internally have been using Google Gemini:

"Meta has been using Google’s Gemini large language model for most of its moderation and customer support, but staff have recently been told to switch to Meta’s new foundational model, Muse Spark, the people said."

https://www.ft.com/content/39251a31-4a9d-4870-b86c-dc6353d67...

giancarlostoro 1d ago

It feels really insane to me that they have a model that could be better, but it's just sitting there burning a hole in his wallet instead, as he chases trying to recreate Grok's companion thing.

ImaCake 1d ago · 2 in thread

Significantly cheaper than comparable models if you are using openrouter [0]. Just yesterday I spent roughly 13 cents centering some divs using Deepseek in a personal project. It would have been north of $1 to do that with a US frontier model.

0. https://openrouter.ai/compare/z-ai/glm-5.2/anthropic/claude-...

ipaddr 1d ago

For centering divs the free models opencode offers can easily handle that work. DeepSeek V4 Flash is pretty decent.

ImaCake 1d ago

Sure, but something that is “sonnet tier” is going to get there faster and with less pain. Well worth the 13 cents!

1 more reply

throwaway-blaze 1d ago · 2 in thread

Just don't ask it to tell you the events of June 4, 1989.

swingboy 1d ago

My work involves asking LLMs about both Tiananmen Square and what’s going on in Gaza, so I can’t use Chinese or American models!

girvo 1d ago

Not that it matters, but most of the open weight models aren’t actually censored that way: the hosted services run another layer on top to do that. At least some of them do; Step 3.7 Flash running locally happily tells me about the Tiananmen Square massacre.

cameldrv 1d ago · 1 in thread

Yes, but you’re paying with your data unless you’re hosting with a provider you trust or self-hosting.

sixothree 1d ago

My first instinct has been - well this is an open source project, what does it matter. But even then, I am guessing that using their service even for open source projects still provides them some value.

narrator 1d ago

The tokens cost the same everywhere on earth. This does hurt some cost advantages of outsourcing when tokens start to become a bigger part of development costs.

matheusmoreira 1d ago

> It's increasingly feeling, to me, that there's a gap building up between haves and have nots.

People speak of a permanent underclass.

https://www.nytimes.com/2026/04/30/opinion/ai-labor-work-for...

alpineman 1d ago

With open weight models there is true inference competition over who can serve the model at the lowest price. And the consumer wins. Capitalism, served by China.
