32 comments · 9 top-level

raincole · 27d ago · 12 in thread

I had been saying this on HN repeatedly: people are going to use the smartest models for coding. They don't care how cheap your tokens are if they don't have the highest probability of solving your programming tasks.

And I was dead wrong. Now I mostly use DeepSeek Pro myself.

vb-8448 · 26d ago

> people are going to use the smartest models for coding. They don't care how cheap your tokens are

I actually think that's still true and will continue to be true as long as someone else subsidizes the tokens. Once the "free money" runs out, things will get interesting.

weitendorf · 27d ago

I pretty strongly feel the opposite way. Granted I have not used deepseek enough to “know” their model idiosyncrasies as well as Anthropic, so there is a partial skill issue. But I just find it really hard to justify using a less powerful model while I work.

The most I’ve ever spent on extra API tokens in a month for my own work is $200, on top of the $200/mo Claude plan I pay for. I use these models quite a lot, though not idly (I usually just walk around and do other stuff until I know how I’m going to approach the next set of problems). So it costs me about $3000/year to get as much as I want of the best model available. Already that seems low enough to not be worth stressing out too much about optimizing it, because it feels like an indisputable good value, and trying to save money with a less powerful model would be optimizing for a $1000-$2000 saving at the expense of a large portion of my work taking longer or being more frustrating and iterative.

That’s not a flex or anything, I get that in other countries $3000/yr is a lot of money for a software developer and also a lot of people would perhaps rationally be better off doing X% worse at work or spending Y% more time on tasks to save $Z, if their productivity improvements didn’t translate to more salary. Otherwise if your performance has more upside I really do think that the smartest models are better with the current pricing scheme. Deepseek and the other Chinese models spend a LOT of time thinking, and tend to be much more jagged (benchmaxxed) in performance. How can dealing with that over an entire year be worth $2k?

The only situations I can think of where sacrificing my own time/performance to save on inference makes sense are batch compute (of course, $1k vs $100k is different from $30 vs $3k) and work where the tier 2 models have crossed the “good enough” threshold. But I think Opus is not even close to that threshold generally yet. As it gets smarter, I (and I think most others, probably) just try to do harder things faster and hit the next wall.
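To put rough numbers on the trade-off described above, here is a minimal sketch. The $200/mo subscription and the $2000 upper bound on savings come from the comment; the $100 hourly value is a hypothetical placeholder, not a figure anyone in the thread gave.

```python
def breakeven_hours(annual_savings_usd: float, hourly_value_usd: float) -> float:
    """Hours of extra work per year at which a cheaper model stops paying off."""
    if hourly_value_usd <= 0:
        raise ValueError("hourly_value_usd must be positive")
    return annual_savings_usd / hourly_value_usd


# From the comment: a $200/mo plan, with savings of at most about $2000/year.
subscription_per_year = 200 * 12
max_annual_savings = 2000

# Hypothetical assumption: an hour of the developer's time is worth $100.
hourly_value = 100

hours = breakeven_hours(max_annual_savings, hourly_value)
print(f"Subscription alone: ${subscription_per_year}/year")
print(f"Break-even: {hours:.1f} extra hours/year, about {hours / 52:.2f} hours/week")
```

Under that assumed rate, the cheaper model only wins if it costs less than about 20 extra hours of rework per year; a lower hourly value moves the break-even point up proportionally.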

6AA4FD · 27d ago

Props for making a falsifiable claim, noticing it was falsified, and owning up to it.

dcchambers · 27d ago

I think two things happened:

1. The sheer number of tokens that a coding agent can use flipped the math upside down on this equation. If you use the most expensive model for everything those costs quickly become untenable, even for software companies.

2. We realized many of the coding problems we're solving aren't incredibly difficult.

simplyluke · 27d ago

The other thing that's changing is more and more CFOs are looking at the AI spend in engineering departments and hitting the brakes. Token leaderboards were cool when the spend wasn't a double-digit-percent of the entire department's budget including salaries.

KronisLV · 27d ago

> And I was dead wrong. Now I mostly use DeepSeek Pro myself.

I've wasted over a hundred Euros re-doing work that was done badly due to the model not being up to task (Vue with TS + wrapper components around PrimeVue, needing to handle event and property passthrough and deal with the stupid Vue SFC issues, TS made this much worse than JS would be). I think it was the GLM model through Cerebras Code at the time, in addition to some GPT and Gemini models with the API pricing.

That said, DeepSeek V4 Pro is pretty good and I can totally see myself offloading some of the work, as long as a better model reviews the work and provides suggestions/tests for it.

bachmeier · 27d ago

Your comment is a slice of the reasoning underlying the "AI will take all the jobs" claim. I would constantly see references to what AI could do and how fast it was improving. Never a word about cost. We should anticipate that there will always be demand for human labor, for cheap models, for local models, and probably even frontier models.

sergiotapia · 27d ago

You should try Composer 2.5 within Cursor. It's so fast, shockingly fast. Going back to GPT/Claude is like using dial-up. And it's great for code work. So far nothing has really tripped it up: backend, frontend, or Metabase reporting dashboards. It's nuts.

jwitthuhn · 27d ago

Yeah I've also found that models are good enough that the extra spend on premium models isn't always worth it, particularly for my small personal toy projects.

A $20 claude sub goes a long way when you plan with Opus and execute with Sonnet.

zuzululu · 27d ago

You weren't wrong; your tasks/problems just didn't warrant a frontier model and were always solvable with a cheap Chinese model.

That doesn't invalidate the rest of us working on tough problems that demand more expensive models and are valuable enough to justify them.

culi · 26d ago

DeepSeek Pro is frontier at this point.

peheje · 27d ago

I mean, hindsight is 20/20, but saying that is like saying "everyone will just use the best tools". That's not what we see in most places in the world for most types of resources.

pants2 · 27d ago · 7 in thread

The Chinese models are only cheap on subsidized Chinese hosting. I have yet to find a USA-hosted Chinese model with a very clear value advantage over US models.

wg0 (OP) · 27d ago

Not true. Also, put DeepSeek V4 Flash on your local machine with effort set to "high" and you'll see why many people are running that model on their own hardware without paying anyone anything.

It's just that some of us didn't imagine having GPUs would be advantageous and were not gamers on the side. Those who had beefy GPUs or GPU rigs for any reason rarely need to go anywhere else.

At least for me, I am so impressed with DeepSeek V4 AFTER using Claude Opus 4.7 for a significant amount of time that I am not going anywhere but DeepSeek V4.

The model is just INSANE. Things I have done with it include attempting to write a 2.5D game engine in C with full animation and map rendering layer by layer.

weitendorf · 27d ago

There are basically two tiers of "Chinese models" in this context, the "edge" sized ones with ~30B parameters or less, and the big ~1T models that can basically only run in the datacenter.

I don't think it's as simple as saying China's hosting is subsidized, they have generally cheaper electricity and labor costs than in the US and don't have access to the top tier models, and a large internal market where the big models are the best thing they can run with what they have. So obviously they max out on their top models (which are trained with their hardware market in mind, not ours) and get the economy of scale from that, and can run generally the same hardware for less money than in the US because

The edge models are very cheap to run and can do so on inexpensive hardware. They are like 95% cheaper to run than Haiku, so the math is in their favor for certain batch workloads. Most people just run the models for themselves when they do that without making it available on openrouter or whatever, because you can just provision a gpu node and use it as needed, and it's not that expensive to run this family of models.

Is your problem that you want to call Chinese models hosted in the US because you're worried about the data handling?

ekidd · 27d ago

The Chinese models are surprisingly cheap and performant sitting under my desk. Qwen3.6 27B is nowhere near as autonomous as Opus 4.7, but it runs in 24GB of VRAM. And it's actually great for the use cases where I'm going to carefully read and understand all the code anyway.

If you want to support a team of engineers, DeepSeek V4 Flash is antirez's current favorite. And you could support a team of engineers pretty nicely for $40-50k. Which might not make sense if you're on a Claude MAX 5x plan or the old enterprise group plan with fixed price seats. But Anthropic is switching their enterprise contracts over to token-based pricing, at which point $50k is looking pretty good.

joshhart · 27d ago

Fireworks will serve them for $1.74 / $0.14 / $3.48 (input / cached input / output); see https://fireworks.ai/models/deepseek-ai/deepseek-v4-pro for the listing. Call it about a third the price of Sonnet.

Not nearly as cheap as the Chinese infra but still pretty cheap.
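As a rough illustration of how those three prices combine, here is a minimal sketch. It assumes the quoted figures are USD per one million tokens (the comment does not state the unit), and the token counts are hypothetical, chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Prices in USD. ASSUMPTION: per one million tokens; the comment gives no unit."""
    input: float
    cached_input: float
    output: float


# Figures quoted in the comment for DeepSeek V4 Pro on Fireworks.
FIREWORKS_DEEPSEEK_V4_PRO = Pricing(input=1.74, cached_input=0.14, output=3.48)

TOKENS_PER_PRICE_UNIT = 1_000_000


def usage_cost(p: Pricing, input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Total cost in USD for the given token counts under pricing p."""
    total = (
        input_tokens * p.input
        + cached_tokens * p.cached_input
        + output_tokens * p.output
    )
    return total / TOKENS_PER_PRICE_UNIT


# Hypothetical agent workload: 2M fresh input, 10M cached input, 0.5M output tokens.
cost = usage_cost(FIREWORKS_DEEPSEEK_V4_PRO, 2_000_000, 10_000_000, 500_000)
print(f"${cost:.2f}")  # prints $6.62
```

Cached input dominates the token count in this made-up workload but contributes the least to the bill, which is why the cached-input price matters so much for long agent sessions.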

harsh3195 · 27d ago

You can find them on Deepinfra, a Palo Alto company, at similarly cheap prices.

__mharrison__ · 27d ago

Odd take. I'm running them locally at my desk (DGX Spark and 128GB MBP). They work fine for 90% of what most folks do. Admittedly, they do run slower on my hw than on the cloud.

slopinthebag · 27d ago

Huh? They're several times cheaper than SOTA models at market rate prices.

ok123456 · 27d ago · 1 in thread

Qwen3.6:35b is good enough for a lot of stuff.

I just used ollama with a shell script to tackle my directory of papers/literature. I converted the first 6 pages of each document to PNG, handed them off to Qwen, and told it to spit out BibTeX, including the abstract. Two days later it was done, and I didn't spend anything on "tokens."
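The commenter's actual shell script is not shown, so the following is only a sketch of what the Qwen step could look like in Python. It assumes a local Ollama server on its default port, PNG page images that already exist on disk, and a model tag written as in the comment; the prompt text and the helper names `build_payload` and `bibtex_for` are hypothetical.

```python
import base64
import json
import urllib.request
from pathlib import Path

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "qwen3.6:35b"  # tag as written in the comment; assumed to be pulled locally

# Hypothetical prompt; the commenter's real wording is unknown.
PROMPT = (
    "These images are the first pages of a paper. "
    "Return a single BibTeX entry for it, including an abstract field. "
    "Output only BibTeX."
)


def build_payload(png_pages: list[bytes], model: str = MODEL, prompt: str = PROMPT) -> dict:
    """Build a non-streaming /api/generate request with base64-encoded page images."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(page).decode("ascii") for page in png_pages],
        "stream": False,
    }


def bibtex_for(png_paths: list[str]) -> str:
    """Send the page images to the local Ollama server and return the model's text."""
    payload = build_payload([Path(p).read_bytes() for p in png_paths])
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `bibtex_for(["paper-1.png", "paper-2.png"])` (hypothetical file names) would return whatever text the model produces; the output still needs checking before it goes into a bibliography.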

abyssin · 26d ago

Why PNG? Isn’t an image format more expensive to process?

mariopt · 27d ago · 1 in thread

I’ve been using Kimi 2.6, GLM 5.1, Minimax 2.7 and lately DeepSeek. I only spend $40 a month and I don’t see the point in paying for Opus/Codex.

Chinese models are really quite good at a lot of stuff.

fittingopposite · 27d ago

Which harness?

replwoacause · 27d ago · 1 in thread

Anybody know what the most capable Chinese model is that can be used in production and is cheaper than US frontier models? Would that still be DeepSeek? My interest is getting as close to GPT-5.5 or Opus quality as I can, but for less $.

culi · 26d ago

Depends what you want it for. Probably Qwen

https://arena.ai/leaderboard

reppap · 27d ago · 1 in thread

The problem with going for open source models is that you are betting on some third party to keep doing expensive model training and releasing the results for free, forever. What do you do if DeepSeek never releases another update to the model?

julianlam · 27d ago

I continue to use the model I downloaded... for free?

SoftTalker · 27d ago

> CFO/CTOs might find out that deploying on an internal cluster of GPUs is far more cheaper and reliable

I think you're right especially if you're someplace that already has a data center, such as a university. Solves a lot of privacy concerns as well.

raylad · 26d ago

Possibly a deliberate strategy by the Chinese to undermine the US AI industry, data centers, and basically everything that’s powering the economy.

Just like they did with the US steel industry in the 80s.

surgical_fire · 27d ago

I am having a great experience with DeepSeek. In fact, it seems to perform better than Claude or Codex in my use case.

I don't see myself returning to Claude or Codex anytime soon.