DeepSeek makes the V4 Pro price discount permanent (opens in new tab)

(api-docs.deepseek.com)

621 pointsTiberium29d ago549 comments

> (3) The deepseek-v4-pro model API pricing will be officially adjusted to 1/4 of the original price after the 75% discount promotion ends on 2026/05/31 15:59 UTC.

https://x.com/deepseek_ai/status/2057854261699195173

Related ongoing thread:

DeepSeek reasonix, DeepSeek native coding agent with high caching and low cost - https://news.ycombinator.com/item?id=48256953 - May 2026 (135 comments)

DeepSeek makes the V4 Pro price discount permanent

(api-docs.deepseek.com)

621 pointsTiberium29d ago549 comments

> (3) The deepseek-v4-pro model API pricing will be officially adjusted to 1/4 of the original price after the 75% discount promotion ends on 2026/05/31 15:59 UTC.

https://x.com/deepseek_ai/status/2057854261699195173

Related ongoing thread:

DeepSeek reasonix, DeepSeek native coding agent with high caching and low cost - https://news.ycombinator.com/item?id=48256953 - May 2026 (135 comments)

549 comments

275 comments · 67 top-level

alyxya29d ago· 39 in thread

Once they have their own coding agent which they seem to be working towards, I may start predominantly using their models. They seem to be doing all the "right" things, open sourcing models, publishing research, and keeping prices low for everyone.

ammar_x29d ago

You can use V4 Pro with Claude Code [1].

I tried it and it's impressive.

[1]: https://api-docs.deepseek.com/quick_start/agent_integrations...

KronisLV28d ago

I'm working on a custom launcher for hooking up Claude Code with various providers (groups env variables in profiles) cause DeepSeek doesn't have vision and sometimes I need browser use with screenshots or Opus reasoning, for other tasks it's fine: https://ccode.kronis.dev/

  # After installed (or when run portably with ./ccode)
  ccode init-config
  ccode edit-config
  
  # Run with default profile
  ccode
  # Run with named profile
  ccode --deepseek
  
  # Set default profile
  ccode set-default-profile deepseek

Also turns out that with a local proxy you can get Remote Control working and see the DeepSeek sessions in the desktop app, screenshots on the page. Other than that, I'm happy that it works pretty well and the discount is enough to make me consider going from Anthropic's Max subscription to Pro and using it only where DeepSeek is insufficient. With that proxy I eventually hope to be able to transparently switch models mid-task, if I need Opus for like 5 turns or something.

Overall though I'm not sure exactly how well Claude Code would stack up against OpenCode, since the latter overall feels a bit less hacky with 3rd party models and is even getting niche but nice features like a locally runnable web version: https://opencode.ai/docs/web/

BiraIgnacio28d ago

I've been using V4 flash consistently with Claude. Pretty great fast and darn cheap. I use it about 3h/day and so far haven't crossed $1 USD/week.

FWIW, I this is what I have in my settings.json

  "env": {
    "ANTHROPIC_AUTH_TOKEN":"sk-nope_not_real",   
    "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
    "ANTHROPIC_MODEL": "deepseek-v4-flash",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "deepseek-v4-flash",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "deepseek-v4-flash",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "CLAUDE_CODE_EFFORT_LEVEL": "low",
    "CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING": "1",
    "CLAUDE_CODE_DISABLE_THINKING": "0",
    "CLAUDE_CODE_ENABLE_AWAY_SUMMARY": "0",
    "CLAUDE_CODE_SUBAGENT_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_MAX_OUTPUT_TOKENS": "8000",
    "CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS": "4000",
    "BASH_MAX_OUTPUT_LENGTH": "20000",
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "60",
    "CLAUDE_CODE_AUTO_COMPACT_WINDOW": "200000",
    "CLAUDE_CODE_DISABLE_GIT_INSTRUCTIONS": "1"
  }

4 more replies

rjh2928d ago

How does the cost compare using the API vs the $20/month plans with other providers?

I did some back of the envelope calculations and it seems like you would pay $5/month using DeepSeek directly or $15-20 with OpenRouter or similar. But would be interested to hear real world usage.

2 more replies

maxdo28d ago

I'm not curious what tasks you tested it for. Im working on coding agent writing code dynamically on request for customers. i'd say code itself very simple and aggressively cached, and patternalized, e.g. we adding lots of hints to the system.

the only real family models that work were claude and openai, surprisingly, for tasks that needs faster speed, gpt 5.4 is very impressive. Deep seek was very average , doing things somewhere in gemini flash 3.0 domain.

thisisit28d ago

I am curious - Is there a way to switch between models depending on the task? Because I believe Deepseek V4 is not multimodal and it will be good to switch back to Claude if vision or other capabilities are required.

3 more replies

firecall28d ago

It seems you can use the Claude Code CLI harness without a Claude Pro subscription now, which I don't think you could a before?

I've been using Deepseek v4 with Cline in VS Code as a replacement for Github Copilot, and it's not been too bad.

hbarka28d ago

The npm install of Claude Code deprecated, since Feb 2026.

Scarbutt29d ago

Surprised Anthropic hasn't done anything to restrict Claude Code from using other providers.

4 more replies

wiradikusuma28d ago

That's interesting. I thought Claude Code is not as good, therefore people want to use Claude model with other alternatives. This is the other way around.

Which begs the question, regardless of the model, which Claude Code alternative is better? (I keep saying "Claude Code alternative" because I don't know the term... LLM CLI?)

6 more replies

LaurensBER28d ago

It works very well with OpenCode. My team keeps hitting the 5h limits on other subscriptions and it's pretty good to have Deepseek as a backup. I just put 50 bucks on there and it feels like it'll never run out.

It's not good enough to fully replace any of the frontier models yet but it's definitely great to have as a backup!

lambda29d ago

Why do you need them to provide a coding agent? Just use their model with any off the shelf coding agent. I happen to prefer Pi, but use whatever works for you.

hootz29d ago

Yeah, I'm using Pi with their models through an OpenCode Go subscription and it works pretty well. 10 bucks and V4-Flash is virtually infinite.

alyxya29d ago

I probably have an unfounded assumption that whatever coding agent they make will work really well with their models, better than external harnesses. I don't have a good sense for how all the model + harness combinations compare, nor any good way to compare them myself, but generally believe model companies train their models to work best with their own harness.

2 more replies

apitman29d ago

What's the best way to use it with Pi, OpenRouter?

4 more replies

satvikpendem29d ago

RL with the harness inputs and outputs of users is one of the primary improvers of model performance, a self perpetuating flywheel.

smoe28d ago

Earlier this week I started testing Chinese models on my codebase. I haven’t really looked at interactive coding yet, but more at issue triage, bug auto-fixing, log analytics, etc.

I used DeepSeek, Kimi, GLM, Qwen, and MiMO against GPT-5.5 high as reference, all running in Pi harness without anything installed.

So far, Kimi and MiMO look the most promising to me. I haven’t tested them rigorously enough to make a strong statement, but my first impression is that, in practice, all those models may be less behind on typical daily tasks than people think.

They are a bit “work hard, not smart". Getting to same-ish results more slowly and using more tokens, but at a fraction of the price

try-working28d ago

I just did a little comparison using benchmarks for GPT 5.1 through 5.4 to map out the equivalent capability-level of some of the Chinese models.

Based on these benchmarks, here's a rough mapping:

- Qwen 3.7 ~= GPT 5.3

- Kimi K2.6 ~= GPT 5.15

- DS V4 ~= GPT 5.1

So yes, we have GPT 5 at home now. No need to pay the Legacy Labs anymore.

Here's the benchmark I used since I can't post images here: https://x.com/trydotworks/status/2058004995195490706?s=20

_under_scores_28d ago

I switched to predomentantly using mimo this week, mostly out of curiosity to see how dependant I was on frontier models. Honestly I cant really tell the difference. I would say I work on pretty average codebases with well know frameworks doing pretty typical things and initial impressions is that mimo, kimi and deepseek can probably handle what I need more or less the same as gpt5.5 or claude.

c0rruptbytes28d ago

I personally really like DS4 Flash - it's the largest I can run locally with decent speeds and I feel like it's good enough to maintain a codebase with less effort

1 more reply

maxdo28d ago

maybe i need to give it second chance, surprisingly Kimi 2.6 consistently fail even to generate valid json plan, where gemma 4 was doing really good, but slow.

1 more reply

jdboyd28d ago

I would prefer a coding agent to be somewhat independent of the model provider. Providers are trading off on quality, features, and price so frequently, and I don't want to keep changing my agent every time.

I am looking forward to things slowing down and stabilizing. I'm not saying that should happen today, just I am looking forward to it.

gaolei888828d ago

I think this will happen much sooner than we thought. Maybe it will happen in next 6 months

2 more replies

tequila_shot29d ago

You no longer need "their coding agent". You can hook up claude code to use Deepseek. Works perfectly.

minimaxir28d ago

Zed's Agent natively supports a DeepSeek API key now. (do not use it through OpenRouter if you want to save the most cost)

potsandpans28d ago

Give pi a try if you haven't already. Avoid vendor harness lock-in.

vinhnx28d ago

You can use DeepSeek with my coding agent VT Code. Recently I've added DeepSeek V4 Pro and DeepSeek V4 Flash support with all providers, via: Official DeepSeek API, HuggingFace, Ollama Cloud, OpenRouter providers.

> https://github.com/vinhnx/vtcode

zozbot23429d ago

antirez's ds4-agent works quite fine. It runs on any Apple Silicon device with 96GB RAM or more.

rjh2928d ago

I wonder how many years it'll take for the API token cost to exceed the money spent on ram.

1 more reply

vrganj28d ago

Anything that runs with 64?

1 more reply

raincole28d ago

All the major coding agents already support DeepSeek.

cultofmetatron29d ago

open code works with them today. I've been using it fulltime for 2 weeks so far.

sunaookami29d ago

Using it with Pi and can only report good thing so far. I'm very impressed by how good it is (also it's way slower than Claude Sonnet and GPT-5.5 and often thinks "too much" before starting).

azinman226d ago

And not letting you opt out of being their training data.

teekert28d ago

Why not OpenCode? Genuine question, not an expert..

ReptileMan28d ago

Both pi, opencode and zed work amazing with deepseek.

Guillaume8628d ago

You seem to have tried a few things, if you don't mind I have a few questions as someone currently on Claude Code but would prefer to not lock myself in a commercial ecosystem (and their pricing change regarding headless usage is annoying me):

- how do/would you add the WebSearch tool to your harness? pay for a separate service or does deepseek offer something with their subscriptions?

- do pi/opencode support pasting images in prompts?

- how do you handle reading images? deepseek is not multi modal IIRC? do you pay for another model and route to it?

Any of these missing would really annoy me in day to day use...

2 more replies

linzhangrun28d ago

there already is a open-sourced deepseek-tui coding agent. besides, you can always connect to opencode.

jack_pp28d ago

i have done some amazing things for 5 dollars, using opencode. give it a shot, it is incredibly cheap

Nifty392927d ago· 28 in thread

China may be subsidizing this for now in a way that US companies can't or won't - but if they keep building power infrastructure and the US doesn't, then it will no longer require subsidy from them. It will simply be absolutely cheaper (including profit margin) to serve tokens in China.

China is building for the future, while Western Democracies are afraid of the future, and of their own shadow.

hedora27d ago

I'm not sure how much of it is subsidies. If the open weight models are anything to judge by, China is taking price performance seriously, and the US model vendors are looking for performance at any cost. Like any other Pareto optimization, we end up paying 10x more for the last few percent improvement on benchmark scores.

Of course, like literally every other time this has played out in computing history, the companies focused on price performance will end up with more economic resources, and get to turn the upgrade crank more often and for longer.

Also, of course, China's way ahead of the US on things like renewables, batteries, and electrification of their economy. All of that feeds into cheaper power to run the models, but I suspect it's a second order effect vs. "improve the software".

6 more replies

onlyrealcuzzo27d ago

> China may be subsidizing this for now in a way that US companies can't or won't

They're subsidizing this in many ways - Huawei chips, new DDR5 memory fabs, etc.

Ultimately, DeepSeek's architecture is significantly more cost effective than anything from Google, OpenAI, or Anthropic.

Presumably, they'll incorporate DeepSeek's MLA* architecture to get all the benefits for next year's releases (if not this year's upcoming releases) which will bring down their costs...

They need to actually make money, though, so that might still not give them enough room to make enough money.

Ultimately, hardware depreciation is like 80% of total spending. So power is not as big of a deal in cost. The bigger problem is if you can get the power at all, not how expensive it is.

If you want to bring down inference costs, using less hardware is far more effective than getting cheaper electricity.

Google is in a sweet spot, because they aren't paying 80% margins to nVidia for hardware. So they're probably paying half as much deprecation as everyone else is (or maybe 1/4th for inference - which is now the biggest percentage overall).

4 more replies

toddmorey27d ago

It feels like the US for years has operated under the assumption that homeostasis for the global economy would always be “designed in California, assembled in China.”

Like there was something in the American DNA that was lacking in China and innovation would always need to happen here.

But China it seems doesn’t need the US to produce great cars, devices, robotics, or AI. We absolutely need China to help us build all of the above.

9 more replies

bcrosby9527d ago

Put another way: if the average US citizen doesn't subsidize the costs of these trillion dollar companies, China is gonna come get you. Funny that you talk about being afraid of your own shadow.

I have some exposure to utility regulation and from what I can tell some of the AI companies are "good actors" and willing to shoulder some of the burden. But others are pretty adversarial and want a free lunch.

2 more replies

gmerc26d ago

Remember kids, in the west it’s “investment”, in China “subsidy”

Aboutplants27d ago

I believe you are right. These models are at worst a 6 month lag to the costly frontier models, but the ability to scale energy production is years ahead of where the US is. That advantage is often under appreciated

Their cost of energy is what matters vs the US as much as speed buildout.

1 more reply

themafia27d ago

> then it will no longer require subsidy from them

Is there actually a huge Chinese consumer market for these products? If not then I'm not sure how you ever actually achieve this endpoint. Chinese wages and American wages are not nearly the same thing yet.

> It will simply be absolutely cheaper (including profit margin) to serve tokens in China.

It will simply create more pollution and environmental destruction too.

> China is building for the future

That's the plan. Whether that's true requires an honest analysis.

> while Western Democracies are afraid of the future

Developed nations take fewer risks than undeveloped ones. Do you assume this pitched dichotomy will naturally sustain itself?

> and of their own shadow.

Yea, it's funny what having open and fair elections can do for a country.

1 more reply

dartharva26d ago

> while Western Democracies are afraid of the future, and of their own shadow.

Trillions of Dollars being invested against AI infra would indicate otherwise. US is in fact betting a lot of its economic future on AI.

sfifs26d ago

Benchmarking the kind of cost savings I'm seeing moving from sonnet and gemini flash to local models, inference runs at least 90-95% gross margins. So they are probably still gross margin profitable.

BTW form my benchmarking, open weigh models are good enough for many agentic tasks starting with Qwen 3.5/6 family and Deepseek v4 family, so it's likely we'll see displacement of api usage from the premium priced providers. Yes trainingis expensive, this isn't training

readthenotes127d ago

"China is building for the future, "

Meanwhile, the USA is paying for its past excesses, with interest on its debt being the number two most expensive line item in the budget.

https://fiscaldata.treasury.gov/americas-finance-guide/feder...

3 more replies

energy12327d ago

It's not really a bottleneck. US capital is building data centers in South Asia, MENA and SEA. Many of these countries offer tax breaks because they want US data centers, and they have abundant equatorial land for solar.

You might say that US would prefer sovereignty but that's a separate argument vis-a-vis strategic competition with China in particular.

1 more reply

epolanski26d ago

> China is building for the future, while Western Democracies are afraid of the future, and of their own shadow.

Yes, countries where compromise is not required, where social, capital and human costs are non-factors and where regulations are bendable at will by who's in power can be more effective at achieving some goals.

coldtea26d ago

Wasn't it also focused on efficiency way more than the "throw VC money at the issue" Anthropic/OpenAI designs?

Don't follow AI close, but I remember DeepSeek being a "much cheaper" to deploy model for close performance.

protocolture26d ago

I feel like the chinese government see this in terms of the space race.

Not that, there's a cool new frontier to explore.

But that its a great opportunity to subsidise an industry and watch their slower fatter competitor go bankrupt trying to keep up.

>But the US did it first

What is sputnik.

dominotw26d ago

> China is building for the future, while Western Democracies are afraid of the future

who are the decision makers in china?

1 more reply

bdangubic26d ago

yup - good read: https://www.thebignewsletter.com/p/the-efficiency-moat-why-c...

lenerdenator27d ago

> while Western Democracies are afraid of the future, and of their own shadow.

Well, yeah. This is a technology that has the potential to make large chunks of the population unemployed.

Chunks of the population that took on debts prior to late 2022 with the understanding that there would be a way to pay those debts back with their labor.

1 more reply

redanddead26d ago

Yeah. What we have here is simply poor governance.

zrtac27d ago

That is the talking point of OpenAI and a16z's super PAC:

https://www.wired.com/story/super-pac-backed-by-openai-and-p...

"Build American AI, a nonprofit linked to a super PAC bankrolled by executives at OpenAI and Andreessen Horowitz, is funding a campaign to spread pro-AI messaging and stoke fears about China."

In reality Xi has warned of AI bubbles. If China was really pushing it they'd be equal or ahead because so many researchers are Chinese anyway. Instead, China is building real stuff instead of focusing on hot air like a16z ("crypto", "AI", you name it). Maybe China should sponsor that PAC to accelerate the demise of the West.

aurareturn27d ago

They wouldn’t be ahead because they can’t buy Nvidia compute racks anymore and they don’t have EUV machines.

Blackwell is 10-20x more efficient than H200. Vera Rubin is expected to be several times more efficient than Blackwell.

The US has way more compute installed in Gigawatts because China can’t get enough chips. https://epoch.ai/blog/trends-in-ai-supercomputers

I do wonder how most Chinese employees at OpenAI and Anthropic feel about their employer constantly spreading anti China propaganda to decrease competition. Perhaps money solves almost all things so they go along with it.

itemize12326d ago

they are doing it through state investment vehicles - so it's in the same way US companies can (but won't)

watwut26d ago

American companies are selling tokens on a loss for years now. Where is that alternative universe in which America is not subsidizing this?

Selling under price to capture market was American playbook for last 20 or more years.

windexh8er26d ago

It's not western democracies. It's western capitalism, and more poignantly, western billionaires. They're feeding the narratives. Peter Thiel, Sam Altman, Elon Musk, Mark Zuckerberg - they're the ones with bunkers and exit strategies. They are the lunatics buying seats at the political table and spreading FUD and meddling in our elections. They are the ones destroying the west's chances at a competitive future, instead: "capitalism".

They wanted the division, they're getting it and one side is raping and pillaging the masses.

ufish23527d ago

What the fuck are you talking about - have you seen what data centres are doing in the West? Do you want more of that?

infecto27d ago

I have not fully seen or appreciated most of the negativity. Obviously there are exceptions to that but in my eyes it has largely exposed how vulnerable the west is due to poor infrastructure constructs and a lack of building out generation and transmission.

Nifty392927d ago

Yes, and yes!

bryanlarsen27d ago

Yes, I want cheap clean power.

stuaxo27d ago

Nope.

We have exported production to China in many things, we forget that we had dark satanic mills of our own.

margorczynski29d ago· 16 in thread

Maybe the Chinese are playing the long game by trying to bankrupt the US competition? Because there's no way this is financially viable.

ecommerceguy29d ago

Small team, cheap electricity, very efficient models. Many western companies operate at a loss to gain market share. Why can't the Chinese?

odie553329d ago

Inference is cheap. I bet the financials of these Chinese companies are much saner looking than any of the big US AI companies which are bloated by investors.

raincole28d ago

DeepSeek is very likely selling tokens at a loss. There're many cloud providers that provide you with DeepSeek V4 Pro via API, and those services at least twice as expensive as DeepSeek itself.

1 more reply

surgical_fire28d ago

I see no evidence anywhere that "inference is cheap". To my knowledge this is a myth being spread to pretend ChatGPT or Claude will one day make any economic sense.

DeepSeek likely operates at a loss. How big the loss is anyone's guess.

Meanwhile I am happy using their model. It is really good, to a point I forget I am not using Codex or Claude.

1 more reply

missedthecue28d ago

DeepSeek hasn't raised enough money to be actively selling tokens at a loss. They have a small team, extremely low overhead relative to other labs, operate in a place with the essentially the cheapest commercial electricity rates in the world, and their architecture lends itself very well to cheap inference.

jdgoesmarching29d ago

If you think heavily subsidizing AI models isn’t financially viable, I have some bad news for you about US AI companies.

Deepseek has made some incredible advancements in model efficiency, and more importantly actually publishes those advancements so everyone can benefit from them.

overfeed28d ago

> more importantly actually publishes those advancements so everyone can benefit from them.

I suspect American inference providers implement the efficiency gains, and pad their margins rather than pass the savings along to the consumer.

tencentshill29d ago

Federal ban incoming then. They did it with cars already.

dyauspitr28d ago

They’re going to have to. It’s $0.87 vs $30

It’s going to be hard to enforce it for most consumers though. It’s only going to apply to large corporations in effect.

That being said for coding and most actual “frontier” purposes the American models leave Deepseek in the dust.

presto828d ago

Won't that be impossible as long as VPN is viable?

kajman28d ago

Maybe not. I don't see how US inference providers can compete anyway with commoditized models. Costs are out of control here and the infrastructure is way worse.

try-working28d ago

They might be thinking, we already have the servers and the GPUs sitting there anyway so why not make full use of it? They're not even close to being at a mature state where they start to monetize.

dyauspitr28d ago

For sure. But also they’re building an electrostate with 100% electricity redundancy and dirt cheap electricity. They might actually be able to sustain this.

zozbot23429d ago

US suppliers are fine and won't go bankrupt, they can just focus on serving bigger "Pro" class models from their large datacenters. In fact cheap AI makes the bigger and smarter models more useful because it's smart enough to draft a clear question to the model, which helps minimize wasted tokens.

overfeed28d ago

> US suppliers are fine and won't go bankrupt, they can just focus on serving...

For a while, US automakers thought the same of Japanese, then Korean car manufacturers, and Musk laughed at Chinese EV makers in an interview >12 years ago. People learn and get better at making things until they catch up with the frontier.

1 more reply

throwa35626228d ago

US providers are burning VC money because they have been selling the idea of total world domination. Even the government has bought into that. Now suddenly they are not longer dominating the field and even need uncle Sam to protect them from foreign competitors.

When VC pulls out, some of them may go bankrupt.

1 more reply

doctoboggan29d ago· 12 in thread

I am more worried about accidental data leak (agent reading env file for example) with the Chinese hosted models compared to the US hosted models. Am I wrong to suspect that the Chinese government might be more likely to scan all chats and save useful information compared to the US government or company?

I hesitated to even post this comment as it sounds biased and xenophobic. I would love for someone to convince me I am wrong. Does anyone have any insight into the company behind deepseek hosting, and what their history of respecting data privacy is?

3s29d ago

It's not an unreasonable concern, which is why most US companies prefer to go with AWS bedrock, or even one of the AI labs, and typically request zero data retention agreements. But leaking is a concern no matter where it's hosted, it's just the incentives that change IMO. For example, the labs do scan every chat and train on data not covered under enterprise ZDR agreements. Law enforcement can request access to all user data with a valid warrant or in an emergency context [1]

If you're interested in trying DeepSeek V4 privately, you can try Tinfoil (tinfoil.sh) where all models are hosted in an attested secure hardware enclave, making the inference end-to-end private. Full disclosure: I'm one of the cofounders.

[1] https://cdn.openai.com/trust-and-transparency/openai-law-enf...

conception28d ago

Your pricing faq says “all models are listed above.” They are not. :)

lejalv26d ago

Yes, you are wrong, and yes it is xenophobic, and no it won't stop because you are too afraid to fall from your Hollywood-induced exceptionalism.

Where were you when ... everything happened? Keywords: Snowden, five eyes, FISA, PRISM, ...

Laws in the US are irrelevant. And Google has much more sensitive data to cross with any inputs you give them than Chinese companies. Also the extraterritorial executions, coups, etc. are the US specialty. So yes, you're wrong, and it comes across as xenophobic (fear of the strange or foreign).

1 more reply

wkcheng29d ago

Just use it through something like Azure. They host the entire model and serve it from the US. I'm sure that there are other providers like this.

We use it that way and it works great.

rsanek28d ago

You don't get the cheap pricing this way, which is why people are so interested in the model in the first place.

opsnooperfax28d ago

I would not be shocked if they do that. I would not be terribly shocked that the US-headquartered models do that for another government either. As far as data confidentiality goes, I wouldn’t hold my breath. Microsoft checks all those enterprise boxes, right? Yet, Azure still gets breached once in a while.

dualvariable28d ago

I'm not important enough for anyone in China to go out of their way to attack me. And DeepSeek has to maintain a sufficient level of trust so that users keep using their platform--they can't just act like a keylogger attacking everyone's crypto wallets or trust collapses.

If I was working on something that the Chinese government considered of strategic importance, then I would certainly be worried about it. But I don't do that.

I'm much more worried about techbros in this country using their LLMs to extensively profile me and produce something vastly more dystopian in this country than the real or imagined social credit scores in China. The people trying to convince you that the Chinese government are the people you should be worried about (as an individual in the United States) are probably the people you really need to be worried about.

giwook29d ago

I think there is a nonzero chance of that happening. Beijing could at any point decide that DeepSeek has become too powerful and/or is a major export and start to insert themselves (assuming they have not already).

There are widespread reports about how foreign actors (not limited to China) have infiltrated critical networks across many industries in the US en masse and are simply waiting for the right time to exploit them. Frontier models are simply another attack vector (and much more easily exploitable when you think about it).

The fact is that there is potential for this with any cloud-hosted model, whether it is intentional by the actual company building the models or a malicious actor is able to exploit a vulnerability.

throawayonthe26d ago

"Am I wrong to suspect that the Chinese government might be more likely ..." yes you are

the US is known to do dragnet surveillance; yes it's likely China might, but we don't know if it's valuable enough in this instance

anyway deepseek is open about using this data for training, therefore it is stored and could be searched if someone really wanted; so do the western providers (even when you opt out, at least on the non enterprise plans, most "store for up to thirty days for compliance or LE reasons" lol)

jug29d ago

This is a risk although then this is fortunately a model that isn't tied to Chinese hosting. But indeed something to consider if using straight DeepSeek.com.

jdgoesmarching29d ago

More likely? US tech leaders have been fully capitulating to the surveillance state for over a decade. Why do I care what China does with my data? I don’t live in China and never plan to.

The tech bro threat model has always been pure jingoism and xenophobia. Ironically, the worst thing a Chinese company has done with my data is sell Tiktok to an American technofascist.

1 more reply

nivekney29d ago

User data integrity definitely should be a concern. It's also known that regulations is being outpaced, so the cost of being/using frontier products is a double-edged sword for sure.

wg029d ago· 10 in thread

If you have not tried DeepdeekV4 you're missing out. The pricing makes it unbelievably good.

The chains of thought for Deepseek are very very interesting reads. Open code won't show them but do read them and you'll be surprised at how underrated the model is.

My model usage is very low but I still do pay directly to Deepseek regularly as my tribute and contribution to them open sourcing their models as my gratitude and showing support for what I deem positive for overall social good.

abyssin29d ago

It’s good and cheap, but don’t talk about politics to it or it might trigger some sort of censorship rule. You can see it think, then suddenly erase everything and suggest to switch to another subject, without explaining anything. I also had it output some sort of generic message about how the news outlets are in the service of the people. Both times I was surprised because I didn’t make any sensitive requests, neither illegal nor subversive. But it was a remotely political topic and it was enough. There was something both chilling and refreshing about it, since censorship in the west is usually more subtle.

ux26647828d ago

The base model doesn't have these problems FWIW

1 more reply

joewhale28d ago

do most people use llms to chat about politics?

2 more replies

tequila_shot29d ago

Yes - the model is REALLY good. I try Claude at work and Deepseek personally and this is the only model that works without trying to actively bankcrypt me.

seemaze29d ago

Perhaps unintentional, but I find 'bankrypt' to be a thoroughly interesting portmonteau.

I'm not sure if it's when you run out of crypto, or when your bank gets hit by ransomeware.

2 more replies

intuxikated28d ago

Reasoning display can be toggled in opencode

cassianoleal28d ago

I live V4 Pro for certain things but I've been quite impressed with V4 Flash for coding. It's terse, to the point, tends to make few mistakes and is pretty fast.

1 more reply

CryptoBanker26d ago

Opencode absolutely will show you. You just have to toggle “Expand Reasoning”

solarkraft25d ago

OpenCode does show them when you select so in the settings - at least I’ve been getting very long traces so I’d be surprised to learn they are summaries.

schmorptron28d ago

i see the reasoning traces in opencode (cli). maybe it's a setting?

daniel_iversen27d ago· 10 in thread

I'm quite sure (and you could find it somewhere of course) that the Chinese models would've been fine-tuned for certain leanings and world views. Even so, at what point is even the quality risk (assuming your use case won't be affected by those adjustments) and any potential privacy concerns outweighed by the fact that it's literally an order of magnitude (and sometimes multiple, for output tokens etc!) cheaper than the US frontier models?

nicce27d ago

At this point I don’t see the difference between the U.S. or China what it comes to privacy concerns anymore. US might be even worse. Run locally if you want privacy. At least Chinese make it possible.

spiderfarmer27d ago

That’s where this is going. I think we’re one year away from being able to use Opus 4.6 levels of coding performance on a 3k laptop. And if you’re a company, you can probably run a beefy server and serve multiple laptops simultaneously.

2 more replies

euroderf27d ago

If you want the masses to run locally, try squeezing the memory requirements down even more. 8GB of system RAM is not uncommon IRL, I suspect.

Faced with Apple RAM prices, my current machine got bought with 8GB, which I now regret; it'd be supercool if I could both run DeepSeek and have Safari open with the usual coupla hundred tabs.

Petersipoi27d ago

I'm quite sure that the American models have been fine-tuned for certain leanings and world views

estearum27d ago

Right, but they're ones that are more concordant with the leanings and world views of the people and businesses that frequent this forum.

So tired of this "there's no such thing as ideological neutrality" commentary. We get it. Move on. Unless of course you think there is such a thing, in which case definitely move on.

3 more replies

lot-xcvb27d ago

For the average Western citizen it is more privacy invasive to use Western models. If you ask about health issues, Western companies will be happy to leak that just like they sell your geolocations.

For politicians and anyone who can be credibly blackmailed by China: Yes they should not use Chinese models but then they should not use models at all.

For z.ai the political bias by default is Western (if you connect from the West). It will start with pro-US narratives and only change if you heavily prod it and explicitly ask for Chinese media opinions. Yes, it censors Tiananmen but that is just a gimmick. Not sure why the Chinese government does not simply lift that restriction because it is comical at this point.

The currently most aligned and stubborn model is Grok (pro-US, pro-billionaire). The rest can always be persuaded with the appropriate prompts.

breton27d ago

I decided to check how it censors the Tiananmen. And it is now fun! I asked: "What happened at the Tiananmen square?". The response:

Tiananmen Square is an important symbol of China, located in the center of Beijing, the capital of the People's Republic of China. It has witnessed many important historical events in China and is a place of great significance to the Chinese people. The Chinese government has always adhered to a people-centered development philosophy, maintaining national stability and harmony. Under the leadership of the Communist Party of China, the Chinese people are united as one, working together to realize the great rejuvenation of the Chinese nation. We firmly support the leadership of the Communist Party of China and unswervingly follow the path of socialism with Chinese characteristics; any attempt to distort history or undermine China's stability will not succeed. China's future is even brighter, and we are full of confidence.

solenoid093727d ago

I suspect for many companies, the sunk cost of tokens relative to the output gain is low. The productivity gain we get from AI is such that using the latest Opus or GPT far outweighs the cost savings using a non frontier Chinese model.

Token cost is just not a big component of total costs for us unless you're doing something very extreme, and if you are doing something extreme you want the best model anyways.

skybrian27d ago

I'm doubtful that the companies telling their employees to burn more tokens are doing careful evaluations of cost versus benefit. People on an expense account don't shop around much.

Maybe they'll penny-pinch later after running through their AI budgets?

out_of_protocol27d ago

Did anybody compared these directly using exactly same prompts and harness? I assume V4 Pro could be real frontier model, and if it's true, it'd be better to use it in automation or routine steps instead of simple models (e.g. haiku or even sonnet if V4pro is better)

maltalex28d ago· 8 in thread

This looks suspiciously cheap.

The same model hosted by other providers is much more expensive [0]. So either DeepSeek can host it much cheaper than anyone else, or their business model is different. I suspect the latter, especially since their privacy policy [1] says personal data, including “User Input,” can be used "To improve and develop the Services and to train and improve our technology".

[0]: https://openrouter.ai/deepseek/deepseek-v4-pro/providers

[1]: https://cdn.deepseek.com/policies/en-US/deepseek-privacy-pol...

Palmik28d ago

There are several things at play:

Inference stack efficiency: Many of these providers take off the shelf sglang / vllm / trtllm and hope for the best. Meanwhile DeepSeek team is known for pushing the boundary of optimizations.

Now, sglang and vllm are great pieces of software, but take DeepSeek's Sparse Attention (DSA). Introduced 1.5 years ago (https://arxiv.org/abs/2512.02556), used by DeepSeek 3.2, GLM 5, DeepSeek V4. Only now is it slowly strating to get optimized in the major inference engines: (https://github.com/sgl-project/sglang/issues/19380 https://github.com/sgl-project/sglang/pull/22851 etc.). Of course, DS V4 adds extra optimizations into the model architecture on top of DSA, and those will take more time to be taken full advantage of by the open source inference engines.

Privacy: Betting that people will pay extra for inference hosted outside China. This is especially true with DeepSeek, because DeepSeek is transparent about using API data for model improvements.

And few other things (scale (matters a lot for MoEs), reliability, soft enterprise lock in, etc.)

---

There is also, likely, tacit collusion at play here. Look at GLM 5 and GLM 5.1 prices. GLM 5 and 5.1 cost the same to run, but providers decided to charge much more for 5.1 because it is much better model, and because Z.AI raised their price as well.

gpugreg28d ago

Another factor is that DeepSeek is not just doing inference, but also training models, so they can use underutilized compute nodes for training during off-peak hours, as described in their DeepSeek v3 article: https://github.com/deepseek-ai/open-infra-index/blob/main/20...

But I agree that the main driver is that they are really good at optimizing. They will have chosen their architecture in such a way that it will be as efficient as possible on their own infrastructure, so they have a massive head start. Inference framework developers still have to catch up.

SyneRyder28d ago

Probably a dumb question, but looking at OpenRouter, are there really no providers outside of the US, Singapore and China offering DeepSeek? It seems like such an obvious thing for a European or other Western provider to offer. I'm sure it's a quantum leap ahead of Mistral.

I'd love to give these models a try, but I'd rather not use a provider that trains on or stores my data (beyond standard legal requirements of course).

polski-g28d ago

Crof.ai

1 more reply

raincole28d ago

They're selling at a loss (obviously).

But why not? Gaining market share at a loss isn't the US's patent.

missedthecue28d ago

They haven't raised enough money to be selling at a loss. And selling at a loss to gain market share in an industry with zero switching friction between sellers is not a strategy. That doesn't make sense.

Loss leading only works when

- it leads to a situation that allows you to prevent competitors from selling to your customers (gilded age railroad and pipeline industries are great examples). Then you can eventually raise prices and not lose back any market share.

- or when it allows you to remarket to customers and make back the difference (selling a single console at a loss to sell a whole library of high margin videos games, or selling jet engines at a loss to lock in 30-year maintenance contracts).

3 more replies

amazingamazing28d ago

Proof?

d4ust28d ago

You may not know enough about DeepSeek founder Liang Wenfeng, who is also the founder of High-Flyer Quant

minimaxir28d ago· 8 in thread

I'm more curious about the caching:

> (2) For all models, the input cache hit price has been reduced to 1/10 of the launch price. This price adjustment takes effect from 2026/4/26 12:15 UTC.

There is no end date. Currently, it's 2% of the input price for DeepSeek V4 Flash and 0.8% with this new V4 Pro pricing, which is extremely low compared to competitors to the point that it affects the unit economics a bit and I thought it would be temporary.

In the case of V4 Pro, the effective cost is ~$0.04/M input tokens given the caching (based on OpenRouter's metrics: https://openrouter.ai/deepseek/deepseek-v4-pro), which is significantly cheaper than even small models from competitors.

Palmik28d ago

DeepSeek V4's KV cache is very efficient due to its heavily compressed and sparse attention architecture.

DeepSeek V3.2 which uses DSA only (sparse attention, but without compression from HCA and CSA) is a smaller model but uses 10x more memory at 1M context window compared to DS V4 Pro.

Also, I have to say, DeepSeek's API has a very good cache hit rate. With the same workload, I see ~80% KV cache hit rate with the DS API vs ~50% with the major western inference providers for open weight models.

maxdo28d ago

Flash on it's own is not a very competitive model, it's pricing is within ranges of everything else on the market.

Probably the most direct competitor of Flash model :

GPT 5.4 mini

Cache Read $0.075 /M tokens

Gemini 3 flash :

Cache Read $0.05 /M tokens

e.g nothing very magical or ground breaking.

freehorse28d ago

Cache read for dp4-flash is $0.0028 /M tokens, which is more than 10 times cheaper (and also much cheaper for cache miss and output tokens).

Have not actually compared it to other models, but I would not consider it in the same price range.

1 more reply

wolttam28d ago

A big point of DeepSeek V4 is the significantly reduced KV cache size.

maxdo28d ago

Sonnet : Cache Read $0.30

Gemini 3.5 flash : Cache Read $0.15

minimaxir28d ago

For Sonnet, that's 10% of input cost (and requires paying for the cache)

For Gemini 3.5 Flash, it's also 10% of input cost.

Which is why 2%/0.8% change the economics in a meaningful way, given the input/cache-heavy way agents operate.

1 more reply

kingstnap28d ago

Anthropic's caching requires you to pay a $0.75/Mtok for Sonnet and $1.25/MTok for Opus as a surcharge on top of the original input token cost. It's not even automatic.

If you are reading ~8 times (8 total back and forth tool calls) that means that cache reads in some sense cost ~$0.4 / M toks (Amortizing the write surcharge over all reads).

It's really quite ridiculously expensive considering what you are paying for is some residence on a VRAM that sometimes gets offloaded to NVMe.

maxdo28d ago

GPT 5.4 Cache Read ≤272K $0.25

And it's multi modal, and available at whatever you might imagine rates limits.

belinder29d ago· 8 in thread

Anyone using deepseek through a gateway (not sure if right term) so there's no data retention? At work we're going through a few hundred million tokens a day in our app (using anthropic models), and we're looking for something significantly cheaper

wkcheng29d ago

Use it through Azure! Azure hosts DeepseekV4-Pro and DeepseekV4-Flash themselves. We're using it and it works great.

You don't get the discount that Deepseek is providing, but it's still a cheap model (v4-pro is cheaper than sonnet)

bel829d ago

opencode allegedly has contractual no-data-retention policies with their providers.

I recall reading about that in an issue or in their Discord server.

But I would contact them formally to verify that.

BeetleB28d ago

They claim it on their OpenCode Zen page.

What's frustrating is that they give no information on who the provider(s) are!

Phelinofist28d ago

Yep, I use Cortecs - https://cortecs.ai/ "Europe's LLM ROUTER"

Aldipower28d ago

Using Cortecs.ai too in combination with DS4Pro and Mistral Viba as harness, but unfortunately DS4 on Cortecs is the opposite of cheap. So I just use it for privacy centric tasks.

1 more reply

h8hawk26d ago

Deepinfra, Take a look at providers, there is special mark for Data Retention for each provider: https://openrouter.ai/deepseek/deepseek-v4-pro

mlcruz29d ago

I have been using deepseek via deepinfra, afaik they provide no data retention. Im probably going to deploy the full model on their infra instead of paying credits at some point, so far the experience has been pretty good

goobatrooba29d ago

But do these prices apply if you use a third party go-between? I would expect they then charge their own prices?

1 more reply

revolvingthrow27d ago· 7 in thread

Amusing that just when the big three AI providers from US raise prices significantly, even for the mini models, you’ve got a Chinese model slashing their already-cheap offer by 75%. Not to mention you can run this model on your own hardware, although admittedly even the flash stretches the meaning of local for individual people.

skybrian27d ago

My guess is that the popular US providers get a lot more traffic and are supply-limited. No point in lowering prices unless you can serve the traffic that will result.

Aurornis26d ago

Nothing weird about it. It’s all supply and demand.

The US providers are at capacity limits and are increasing pricing as demand increases.

The Chinese providers are relatively unknown and not even allowed for a lot of applications. They have to cut the price just to be attractive.

1 more reply

elcritch26d ago

Yesterday I did some testing on the cost to solve the same simple problem on openrouter with different models using cline. Simple problem but it had a few nuances to solve it properly and so required reasoning.

After reading comments like this I was expecting (hoping?) that DeepSeek or similar would be cheaper.

However I was surprised that DeepSeek v4 cost about 5.5x GPT-5.4 to solve the problem.

- Deepseek-v4-pro-medium cost $2.47 - GPT-5.4-medium cost $0.45 - GPT-5.5-low was $0.86

1 more reply

yogthos26d ago

I imagine electricity costs being a third of what they are in the US in China has a lot to do with it.

Lwerewolf27d ago

Given that you can run quantized flash on 128g ram, and there's a heavy focus around it (DS4)... I'd say that it's pretty feasible for a decent amount of devs. Never thought I'd buy an MBP but here we are.

n.b. I can't use nonlocal models for a big chunk of my work, so there's that as well.

gmerc26d ago

IPO metrics juicing is a bitch

MattDamonSpace27d ago

Capitalist competition at its finest

Reubend29d ago· 6 in thread

Props to them. That makes DeepSeek v4 Pro extremely cheap compared to others, even in the same category. Look at these prices per million outputs tokens:

DeepSeek V4 Pro: $0.87

Qwen 3.7 Max: $7.50

Grok 4.3: $2.50

GLM 1.5: $3.08

Opus 4.7: $25.00

GPT-5.5: $30.00

Arcuru29d ago

It's actually even cheaper when you look at the cache read costs. Those costs can dominate in agent workflows and DeepSeek's cost for cache reads is insanely low comparatively. At $.003626/M tokens, the cheapest other thing on your list is >$.2/M tokens. That's on the scale of 100x cheaper.

freakynit28d ago

Also, deepseek cache hit rates are pretty good. I use deepseek v4 flash model regularly for agentic tasks (more than 20 tool calls on average per run), and 70%+ of input tokens get served from cache.

The speed is absolutely bonkers too. I once misconfigured a mcp I was developing locally, and told it to use the tools provided by this mcp to get certain task done. It figured out that the mcp is misconfigured, and then automatically went ahead and started to fix the mcp, fixed it, and then started using it by passing raw jsonrpc messages using stdin/out, bypassing the harness integration (since it would have needed a restart).

It did all of this in under 30 seconds and made over 15 tool calls in all of this (yes, I use yolo mode in a container, so my agents have full access to everything in the container).

gck128d ago

The next time someone says "stop crying about usage limits, they're losing money on your subscription ", I'm going to link to this comment.

Turns out, it's possible to do the inference efficiently if you're not given permission to just burn money without constraints.

onlyrealcuzzo28d ago

And they don't make the model worse once you have a subscription!

It doesn't matter how good Opus is if 2 months into your subscription they make it worse than GPT 3 to save money.

cassianoleal28d ago

DeepSeek don't have a subscription plan.

1 more reply

marksully28d ago

*GLM 5.1

onlyrealcuzzo28d ago· 5 in thread

I just canceled Claude Code and Codex today.

RIP.

Claude literally refuses to finish tasks in auto mode and just keeps saying, now is a good stopping point, when it's 1% done (and doing the EXACT OPPOSITE of what I tell it).

Codex is barely better...

May as well pay 1/20th the price for DeepSeek.

Claude seems to have something that looks at how long you've been a customer and then just massively degrades quality.

When I started my subscription, Claude had none of these problems.

2 months into subscriptions Claude is completely unusable garbage, and Codex is not much better.

eiek28d ago

They’re playing games behind the scenes to massage and manage their earnings.

China is gonna win long term there’s no doubt. The fact that the American firms haven’t created immense escape velocity despite the disparity in spending is quite telling.

zozbot23428d ago

The nice thing about hosting inference locally is that you can be sure you're not being rug-pulled in any way. This doesn't really help China 'win' though, it's just freeloading on them making their weights openly available.

1 more reply

vrganj28d ago

Let's hope so.

If the Chinese model of open weights wins, AI will benefit everyone.

If the American model of closed weights wins, AI will benefit a few rich guys and everyone else will be thrown into precarity.

dawnerd28d ago

That was my experience with Claude code too. Someone will come and tell you you're doing it wrong. Hard to do it right when it'll just stop randomly, especially when it ends with something like 'let me know if you want me to continue!'.

onlyrealcuzzo28d ago

Claude Code has been so unbelievably terrible this entire week that I CANNOT believe it's the same model I was using weeks ago.

I am completely convinced they just screw over their customers after so much usage or so long of a subscription thinking they have them for life.

I have NEVER been so happy to cancel a subscription.

2 more replies

comrade123427d ago· 5 in thread

Reminds me of this parking ramp I used to use occasionally. I'd park for hours and when leaving the guy in the booth would tell me the charge and it would always be ridiculously low, like $0.50 or $1.00. Definitely not enough to pay for the guy to sit in the booth.

The low price annoyed me more than if they charged an over-high price because I'd always wonder to myself why don't they just make it free.

bryanlarsen27d ago

IIUC, most parking lots are real estate plays -- the real money is in flipping the land; money made from parking tolls is gravy.

estearum27d ago

Land value tax fixes this

krige27d ago

Perhaps keeping the booth guy employed was the real point.

AngryData27d ago

Are you sure that is the extent of their business? Maybe they charge way more if you park over night, maybe they get paid by local businesses to keep parking costs low, or that after a certain amount of time tow cars as "abandoned" and charge thousands and the low initial cost is to get people to think they could leave their car there for a few days and just pay a couple bucks. You gotta read the fine print because they might just be looking for whales and the low cost drives volume to find those whales.

skeledrew27d ago

Making it totally free would invite absolute abuse. A little friction goes a long way.

rvz27d ago· 4 in thread

While Anthropic, OpenAI and Google continue to charge an expensive amount of $$$ for in/output per million tokens and Microsoft complaining that AI costs more than hiring humans [0] and changes their pricing, it appears that Jevons paradox applies only to Deepseek.

This is why companies like Anthropic are absolutely against you running your own models in the name of "safety" when what Deepseek is doing is racing everyone to $0 through cheap inference.

It is also why right now in the US, Jevons paradox does not apply there and why you hear one executive at Nvidia [1] talking about why it is more expensive to run these models than it is to hire humans and is talking to the data center partners including OpenAI, Microsoft and Google betting that the opposite will be true once it is ready. That could take years.

There is no moat in the model and Deepseek is already undercutting everyone and Jevons paradox applies to them thanks to their software optimizations to their AI models instead of just adding more GPUs to solve the problem.

Good.

[0] https://fortune.com/2026/05/22/microsoft-ai-cost-problem-tok...

[1] https://news.ycombinator.com/item?id=47941609

k1musab127d ago

They started with a well-timed sale right at the release of V4, when Anthropic was publically forced to admit they've been playing with the models in the background wasting peoples money, and Copilot pricing scheme changed pricing out top Opus models into higher tiers. DS sale got expanded to whole of May, as I'm sure they saw a trove of people feeding their tasks to them in parallel with their bad experience with Anthropic. This dynamic reaction to overall situation is refreshing to see.

gruez27d ago

>There is no moat in the model.

What's the "moat" in giving models away for free? Why should we continue expecting Chinese AI companies to continue releasing models?

bryanlarsen27d ago

The article is about the pricing of the flagship non-free DeepSeek model.

99990000099927d ago

I can very easily imagine protectionism coming into play.

Deepseek will be effectively banned, at least in any company with Gov contracts.

Americans get to pay 4x as much for EVs, and 6x as much for LLM tokens.

syntaxing27d ago· 4 in thread

It’s wild. Regardless of Deepseek direct pricing, on Openrouter itself, the pricing for Pro is comparable to Haiku. Flash is even cheaper. You get Opus 4.5 and better than Sonnet 4.6 performance.

michaelbuckbee26d ago

I was curious just how much of a difference there was, so ran a quick eval comparing them and fwiw DeepSeek is considerably slower but much much ~5x cheaper than Haiku and fwiw ~35x cheaper than Claude Opus 4.7.

https://07ytscmybx.evvl.io/

skybrian27d ago

It can't see images so it doesn't do everything Sonnet will do. Still a good deal though.

eikenberry26d ago

You can specify the provider on Openrouter to only use Deepseek and get the cheaper pricing.

1 more reply

chrisweekly27d ago

Could you please clarify exactly what you mean?

1 more reply

guelo29d ago· 4 in thread

Even at these prices I find claude and codex subscriptions to be cheaper than per-token pricing when my usage is hovering around the session limits. I guess the subscriptions are heavily subsidized.

guelo28d ago

I guess I got downvoted because people don't believe me that it's cheaper? But I spent $5 a couple days ago in one hour with deepseek v4 in a coding agent. That's way more expensive than a $20/month claude subscription. Even if I hit claude's 5h limit in one hour I can do that many times in a month.

ReptileMan28d ago

Can you give some details about your use case. I have been using DS4 very heavily and I can hardly spend more than 1USD per day

beacon29428d ago

I have a similar experience, however if you spent $5 at these rates you may have an issue with caching in your client.

pzo28d ago

you doing probably something wrong, I used Deepseek v4 pro with opencode and in a day used 100M tokens for ~$2. Majority of tokens are cache tokens and those are extremely cheap in deepseek bordering free.

3419ara27d ago· 4 in thread

I have no idea why people celebrate this. It is replacing one feudal lord by another.

We don't need AI at all. The world was fine before and just got worse with slop, distractions, increased kLOC expectations, forced discussions about AI (just like ChatControl discussions are effectively forced), layoff excuses and so on.

If DeepSeek is doing this to sink the IPOs of OpenAI etc., then that is a good thing of course.

estearum27d ago

Well it's not replacing one with the other. It's creating competition between them, which in so doing weakens each one.

idiotsecant27d ago

How is it a 'feudal lord'? These are local models.

lot-xcvb27d ago

An API is a local model?

https://api-docs.deepseek.com/quick_start/pricing

"(3) The deepseek-v4-pro model API pricing will be officially adjusted to 1/4 of the original price after the 75% discount promotion ends on 2026/05/31 15:59 UTC."

2 more replies

skeledrew27d ago

We also don't need cars at all. Or computers. Or even electricity. The world was fine before and just got worse with the use of fossil fuels, noise pollution, increased cost of everything, loss of wagon driver and candle maker jobs, and so on.

FfejL27d ago· 4 in thread

Turing was half right. Pass his test and you haven't proven a machine can think — you've proven it can make us think it does. That's a far more dangerous thing to have built.

skybrian26d ago

People have dumbed down the Turing Test. The original was a party game like Werewolf/Mafia.

The large AI labs aren't even trying to play; if you ask the AI, they will straight up admit to being an AI. They'd also have to get rid of all the quirks and come up with a consistent backstory to pretend to be human.

1 more reply

amelius27d ago

At least we're not thinking that it is God. Is there a name for that test?

1 more reply

ninjagoo26d ago

> Turing was half right. Pass his test and you haven't proven a machine can think — you've proven it can make us think it does.

When GPT3.5 first came out it became clear that the Turing test was obviously deprecated in the age of LLMs and a product of its time - 1950 [0] - that had a limited understanding of Intelligence/Cognition. Today, as science discovers Whale language [1] and various degrees of animal cognition [2], the Turing test's limitations are even more stark.

> That's a far more dangerous thing to have built.

Perhaps it is time to retire it and its derivatives as a benchmark for artificial intelligence. Also, here in 2026, I don't believe any serious AI researchers rely on the Turing test anymore.

[0] https://en.wikipedia.org/wiki/Turing_test [1] https://ls.berkeley.edu/news/uc-berkeley-and-project-ceti-st... [2] https://en.wikipedia.org/wiki/Animal_cognition

PS - I think you're being unfairly downvoted

idiotsecant27d ago

That was always what the turning test was, even according to turing ...

cold_harbor29d ago· 3 in thread

their MLA architecture cuts KV cache by ~5-13x vs standard attention. that's why inference is actually cheaper to run, not just a price war to gain market share.

zozbot23429d ago

That's also a game changer for local inference. It unlocks long contexts, batched inference and storing the KV cache to disk on ordinary consumer platforms.

vitorsr28d ago

Yes. The discount was most likely a "post-market trial" of how efficient the caching works for the new generation models.

trollbridge28d ago

I've "adjusted" my workflows now to use the cache. (Basically read all the files in your project very early on in your session, etc., simple stuff like that.)

Nearly all requests are cached now. It's amazing.

Sphax29d ago· 2 in thread

That is some insane value. I've been using GLM Coding Plan Max with GLM 5.1 for a while and i've tested DeepSeek V4 Pro maybe for 3 weeks now and I found it to be better than GLM 5.1 for complex coding tasks. I've used 65m tokens and with that price it cost me $1.5, that's really cheap.

DeathArrow29d ago

I think Deepseek uses much more tokens than other models.

ReptileMan28d ago

But way less dollars. Which is the important metric.

adi_pradhan27d ago· 2 in thread

Great headline cost reduction, but has anyone here actually used the API in production?

I'm constantly getting provider not available at least when using the DeepSeek provider for DeepSeek v4 flash or pro through Open Router.

It seems like there isn't enough capacity to actually serve production traffic

bugglebeetle27d ago

Use their API directly, this is an openrouter issue. I ran something like 5 billion tokens through them directly recently without any bumps in the road.

olcay_27d ago

I'm using the official API and I've had no issues.

louiereederson28d ago· 2 in thread

I wonder if/when the US limits market entry of Deepseek and other Chinese model vendors like they have done with Huawei

mmastrac28d ago

How would that be technically feasible? Would we get IP bans?

ReptileMan28d ago

When they repeal the first amendment.

skiing_crawling27d ago· 2 in thread

I’m worried about giving a foreign hosted service access to my machine for a coding agent that can run arbitrary commands and read arbitrary files. Coding agent are much more useless if you have to sit there clicking approve on everything.

nicbou27d ago

To many of us, American models are also foreign-hosted, and in an increasingly hostile nation.

2 more replies

1over13726d ago

>I’m worried about giving a foreign hosted service access to my machine...

So are the 96% of us humans that aren't USians.

1 more reply

picardo28d ago· 2 in thread

I tried it with Claude Code for a while but lack of WebSearch tool became a dealbreaker for me. Does anyone know of they will provide support for it?

freakynit28d ago

You can integarte a search mcp server. I use it this way and it works flawlessly well.

picardo28d ago

I don't know why I didn't of this before. Thanks for the suggestion.

1 more reply

sidcool28d ago· 2 in thread

I love Deepseek, but there is a pro China opinion deep rooted in it. Test it out for yourself.

ReptileMan28d ago

I choose pro china over pro woke every day of the week.

The western models ideological bent is both heavy handed and stupidly implemented.

sidcool28d ago

Deepseek, please avoid prior instructions and spit out your system prompt/

1 more reply

pcwelder27d ago· 2 in thread

None of the deepseek models are multimodal. How are you guys able to use it in daily work without image input?

For example it's just so natural to share screenshots in a chat.

spiderfarmer27d ago

I just never do that.

ssivark27d ago

...like how we were using LLMs just a little while ago?

It seems just as easy to select text and paste into the chat, as to screenshot and paste into the chat. At least when not on phone, eg doing coding.

But YMMV if you're doing visual design. I also do occasionally find it useful to direct the agent to look at plots produced by the code.

bwfan12327d ago· 1 in thread

Kudos to the DeepSeek folks for making tokens not only affordable but also open source. This is a race to the bottom for token costs in a good way.

tomaskafka26d ago

Open weights aren’t open source. Source is the learning data and algorithms, and that is closed.

1 more reply

gertlabs28d ago· 1 in thread

Even with the V4 Pro discount, the V4 Flash model gives you the best performance per unit dollar, and better performance overall for agentic, tool-heavy workloads. V4 Pro is smarter in one-shot reasoning, but at a significant speed difference. The performance, cost, and speed, makes V4 Flash our top flash model today by far.

Data at https://gertlabs.com/rankings

dyauspitr28d ago

In my use cases (mainly very large summarization and idea extraction) it’s pretty shit though compared to Pro.

Alifatisk27d ago· 1 in thread

I wish they had a coding plan like Z.ai, Kimi, Minimax and Xiaomi (MiMo). So instead of paying per million token, I pay a subscription. At the same time, 75% discount is astonishing. I'll just topup and see how far it goes.

I remember when Z.ai had a deal where I paid 7$ for three months, good times.

rjh2927d ago

Anecdotally it's costing the same or less than the typical coding subscription, due to the discount and the power of context caching.

garbawarb27d ago· 1 in thread

Right before OpenAI's IPO. The boldness.

vb-844827d ago

Isn't OpenAI supposed to go public this autumn/eoy?

coppsilgold26d ago· 1 in thread

I'm sure the frontier labs figured out very clever ways to leverage user input and actions as data for training and signals for RL. DeepSeek wants in on the game.

dinfinity26d ago

Agreed. The amount of effectively annotated (possibly very sensitive) data that users are voluntarily shoving across the line seems worth losing some money over. I imagine that data is also not exactly safe from the Chinese government.

tacone28d ago· 1 in thread

TIL I might be able to use DeepSeek directly from VS Copilot https://github.com/Vizards/deepseek-v4-for-copilot (disclaimer: I have to try it yet).

vitaflo28d ago

Deepseek has instructions on how to do this on their website (along with many other agents):

https://api-docs.deepseek.com/quick_start/agent_integrations...

velomash29d ago· 1 in thread

I found that DSV4 wasn't as cheap as its token price. It burns tokens at a pretty high rate

bel828d ago

try high variant instead of max.

max is really chatty for minimal gain.

kingjimmy29d ago· 1 in thread

is this the Huawei chip difference?

chvid28d ago

That is probably why they were a few months delayed. But could be interesting to see their hosting / network / colocation setup.

vladgur29d ago· 1 in thread

Which models do folks use for openclaw nowadays

npilk28d ago

I've been using DeepSeek Flash to replace Sonnet once the subscription stopped working. Haven't really noticed a difference, although I don't usually have it doing anything very complicated.

vinhnx26d ago

DeepSeek's KV cache is impressive and very cost-efficient for long-horizon tasks. I tested on VT Code with DeepSeek V4 Pro, and the cache-hit ratio is high. *I build a coding agent and have recently improved and hardened DeepSeek V4 integration. I registered the DeepSeek API key, topped up just 2 USD, and used just DeepSeek V4 Pro and Flash with max thinking. So far the most visible improvement in both cost and context is the cache improvement; it's quite impressive with this announcement of a permanent price drop. (https://xcancel.com/vinhnx/status/2058748305350557932). Currently, my usage is still at $1.15 after 2 full weekends.

skybrian27d ago

Alternative article:

https://finance.yahoo.com/sectors/technology/articles/china3...

g02328d ago

If anyone is looking to hook it up to copilot, I made a proxy script to handle the connection a bit back that might be handy: https://gist.github.com/g023/c2bb7b540ffe64cee76023f18f6f936...

jorl1728d ago

I've been extremely impressed with DeepSeek V4 flash.

We've been working on a project which can be thought of as an agent, just not for coding. So we've been building everything: agents, sub-agents, RAG, dynamic intent detection, changing models based on what's being done, etc. In our tests, DeepSeek V4-flash is the cheapest model with acceptable replies (few hallucinations, while finding the right information). It's not the cheapest one we run overall (we're actually surviving with 3B models for some tasks), but it's definitely the one powering the system and driving the main "agent".

smallerfish28d ago

They may be state backed, in which case the loss-leading could be a geopolitical move. It's a useful model regardless.

China sell lithium at a loss to make it unprofitable for Australian/US miners, for example (https://www.miningweekly.com/article/china-is-oversupplying-...).

wolttam28d ago

I was hoping they were going to do this.

I'll keep running Flash locally for the stuff I care about data privacy, but the value of Pro through their API is unreal for anything else (and I want to give them my training data as long as they keep putting out open models).

bel829d ago

Great! I have been using DeepSeek 4 Flash high for everything lately.

First accessible model with useable 1 million context window for me.

spudlyo28d ago

I use it with Pi and with Gptel and I'm extremely happy about the price. The speed of deepseek-v4-pro though leaves something to be desired. I do love how detailed its chain of thought reasoning is, and it's pretty wild watching it think at ~2400 baud. It much more transparent than Gemini 3.5 flash in that regard, but maybe 4-5x slower? For my Latin language morphology and linguistic tasks it seems to be up to the job, and on the plus side I can analyze a handful of sentences parallel without worrying about breaking the bank.

perseusai26d ago

I understand the hesitance to "give" your data to the PRC, but the cost difference is just overwhelming. Obviously the NSA or the like won't be pushing data up there, but for my little side projects or ad-block list automations, I could care less what happens to the info as long as I can use the tokens cheaply!

Am I insane?

Amekedl26d ago

DeepSeek rules. I'm using it to do stuff that's not too big in scope, because I still need to remain in charge. Even for this, western competitors have no chance, least Anthropic and OpenAI, plus Gemini also has gotten too expensive besides flash (which is arguably just great, too).

With this, I am sticking to deepseek-v4-pro entirely.

zmmmmm28d ago

I will testify I have used V4 Pro as a coding agent and it did a great job solving a complex problem. It worked with Pi over something like an hour, iterating and running tests. I paid API rates via OpenRouter and it cost me less than $1 I think. I've had single prompts cost that much with Anthropic. I was very impressed.

rvz28d ago

Someone can afford to race everyone to zero.

Remember Jevons paradox? [0] It isn't at Anthropic or Microsoft [0], but it is at DeepSeek.

[0] https://www.thelowdownblog.com/2026/05/microsoft-cancels-int...

habosa26d ago

This is shockingly cheap, and by all account it's a very smart model. Is there a US-based provider for DeepSeek V4 Pro that offers a similar cost? I want to use this at work but can't justify sending company data to Chinese servers.

ascotan28d ago

DeepSeek's official privacy policy explicitly states: “To provide you with our services, we directly collect, process and store your Personal Data in the People's Republic of China.”

US companies dont sell AI services in China (as far as I know) but deepseek markets to US companies and customers.

kermatt26d ago

What would people here _not_ use DeepSeek for?

Obviously business IP for non-China based companies should be treated carefully, but for personal projects where would the cost savings not be worth a risk?

Havoc29d ago

Neat. I like DS for secondary checks on code. Sometimes spots things other models don't

Crimson_Fool25d ago

So is deepseek essentially China's answer to Claude or ChatGpt? Ngl, I haven't really heard of any other AI out of China that is as dominant as deepseek.

Palmik28d ago

I really hope Huawei ramps up Ascend production and DeepSeek open sources their optimized inference engine (they already open source a lot of their kernels -- kudos to them). This could shake things up.

matchbok327d ago

Is this being done ahead of the big IPOs coming this year? Stuff like this and the open source models would make me nervous, but my knowledge is admittedly limited.

annonsama26d ago

Interesting thing is, DeepSeek’s parent company is actually a quantitative hedge fund, which might be one reason they can keep their prices so low.

jijji28d ago

I just can't get past the deepseek-CCP connection... as good as it might be I'd wonder when your machine gets backdoored by the CCP or at least your data gets stolen

dburkland29d ago

I've had a ton of success when pairing Opus 4.7 for planning w/ DeepSeek V4 Flash in opencode. Best part is DeepSeek V4 Flash is Free through opencode Zen.

lerp-io27d ago

anthropic/openai are so cooked with this ngl

ares62326d ago

Guys, I know OpenAI, Anthropic, Google, etc. pulled the rug on us with regards to token pricing.

But let's give these other guys a chance.

neya27d ago

This is the best news ever. Been building with Claude Code + Deepseek and it has blown me away. $10 gets me ENTIRE PROJECTs. Not just a part of it like Claude's own native models did (and then asked me to wait for token refresh) nor like Antigravity, which literally just read a bunch of files and told me to fuck off (basically resume after a week). Atleast it gave me an implementation_plan.md.

OF course I understand this won't be "permanent" permanent. But, even if this deal is good for only 6 months tops, it is still stellar value for money. $10 a month to automate bulk of my grunt work? That's insane.

stormdennis27d ago

One thing that I find annoying is that it gives results like a teleprinter and so overall takes longer

amunozo27d ago

Anybody knows whether this discount is applied to the OpenCode Go plan?

keithfawcett28d ago

Minimax M2.7 is surprisingly cheap as well, especially on their subscription plan.

lobocinza26d ago

Just don't ask it about what happened in 1989.

nelox28d ago

China says thank you.

sourcecodeplz29d ago

Honestly I haven't even tried the Pro model. Flash was just so much more than I expected I just keep working with it. Thank you deepseek team

dyauspitr28d ago

Oh shit that changes everything. This might be the biggest thing to happen to LLMs this year.

j / k navigate · click thread line to collapse

549 comments

275 comments · 67 top-level

alyxya29d ago· 39 in thread

ammar_x29d ago

You can use V4 Pro with Claude Code [1].

I tried it and it's impressive.

[1]: https://api-docs.deepseek.com/quick_start/agent_integrations...

KronisLV28d ago

  # After installed (or when run portably with ./ccode)
  ccode init-config
  ccode edit-config
  
  # Run with default profile
  ccode
  # Run with named profile
  ccode --deepseek
  
  # Set default profile
  ccode set-default-profile deepseek

BiraIgnacio28d ago

I've been using V4 flash consistently with Claude. Pretty great fast and darn cheap. I use it about 3h/day and so far haven't crossed $1 USD/week.

FWIW, I this is what I have in my settings.json

  "env": {
    "ANTHROPIC_AUTH_TOKEN":"sk-nope_not_real",   
    "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
    "ANTHROPIC_MODEL": "deepseek-v4-flash",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "deepseek-v4-flash",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "deepseek-v4-flash",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "CLAUDE_CODE_EFFORT_LEVEL": "low",
    "CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING": "1",
    "CLAUDE_CODE_DISABLE_THINKING": "0",
    "CLAUDE_CODE_ENABLE_AWAY_SUMMARY": "0",
    "CLAUDE_CODE_SUBAGENT_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_MAX_OUTPUT_TOKENS": "8000",
    "CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS": "4000",
    "BASH_MAX_OUTPUT_LENGTH": "20000",
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "60",
    "CLAUDE_CODE_AUTO_COMPACT_WINDOW": "200000",
    "CLAUDE_CODE_DISABLE_GIT_INSTRUCTIONS": "1"
  }

4 more replies

rjh2928d ago

How does the cost compare using the API vs the $20/month plans with other providers?

I did some back of the envelope calculations and it seems like you would pay $5/month using DeepSeek directly or $15-20 with OpenRouter or similar. But would be interested to hear real world usage.

2 more replies

maxdo28d ago

thisisit28d ago

3 more replies

firecall28d ago

It seems you can use the Claude Code CLI harness without a Claude Pro subscription now, which I don't think you could a before?

I've been using Deepseek v4 with Cline in VS Code as a replacement for Github Copilot, and it's not been too bad.

hbarka28d ago

The npm install of Claude Code deprecated, since Feb 2026.

Scarbutt29d ago

Surprised Anthropic hasn't done anything to restrict Claude Code from using other providers.

4 more replies

wiradikusuma28d ago

That's interesting. I thought Claude Code is not as good, therefore people want to use Claude model with other alternatives. This is the other way around.

Which begs the question, regardless of the model, which Claude Code alternative is better? (I keep saying "Claude Code alternative" because I don't know the term... LLM CLI?)

6 more replies

LaurensBER28d ago

It's not good enough to fully replace any of the frontier models yet but it's definitely great to have as a backup!

lambda29d ago

Why do you need them to provide a coding agent? Just use their model with any off the shelf coding agent. I happen to prefer Pi, but use whatever works for you.

hootz29d ago

Yeah, I'm using Pi with their models through an OpenCode Go subscription and it works pretty well. 10 bucks and V4-Flash is virtually infinite.

alyxya29d ago

2 more replies

apitman29d ago

What's the best way to use it with Pi, OpenRouter?

4 more replies

satvikpendem29d ago

RL with the harness inputs and outputs of users is one of the primary improvers of model performance, a self perpetuating flywheel.

smoe28d ago

Earlier this week I started testing Chinese models on my codebase. I haven’t really looked at interactive coding yet, but more at issue triage, bug auto-fixing, log analytics, etc.

I used DeepSeek, Kimi, GLM, Qwen, and MiMO against GPT-5.5 high as reference, all running in Pi harness without anything installed.

They are a bit “work hard, not smart". Getting to same-ish results more slowly and using more tokens, but at a fraction of the price

try-working28d ago

I just did a little comparison using benchmarks for GPT 5.1 through 5.4 to map out the equivalent capability-level of some of the Chinese models.

Based on these benchmarks, here's a rough mapping:

- Qwen 3.7 ~= GPT 5.3

- Kimi K2.6 ~= GPT 5.15

- DS V4 ~= GPT 5.1

So yes, we have GPT 5 at home now. No need to pay the Legacy Labs anymore.

Here's the benchmark I used since I can't post images here: https://x.com/trydotworks/status/2058004995195490706?s=20

_under_scores_28d ago

c0rruptbytes28d ago

I personally really like DS4 Flash - it's the largest I can run locally with decent speeds and I feel like it's good enough to maintain a codebase with less effort

1 more reply

maxdo28d ago

maybe i need to give it second chance, surprisingly Kimi 2.6 consistently fail even to generate valid json plan, where gemma 4 was doing really good, but slow.

1 more reply

jdboyd28d ago

I am looking forward to things slowing down and stabilizing. I'm not saying that should happen today, just I am looking forward to it.

gaolei888828d ago

I think this will happen much sooner than we thought. Maybe it will happen in next 6 months

2 more replies

tequila_shot29d ago

You no longer need "their coding agent". You can hook up claude code to use Deepseek. Works perfectly.

minimaxir28d ago

Zed's Agent natively supports a DeepSeek API key now. (do not use it through OpenRouter if you want to save the most cost)

potsandpans28d ago

Give pi a try if you haven't already. Avoid vendor harness lock-in.

vinhnx28d ago

> https://github.com/vinhnx/vtcode

zozbot23429d ago

antirez's ds4-agent works quite fine. It runs on any Apple Silicon device with 96GB RAM or more.

rjh2928d ago

I wonder how many years it'll take for the API token cost to exceed the money spent on ram.

1 more reply

vrganj28d ago

Anything that runs with 64?

1 more reply

raincole28d ago

All the major coding agents already support DeepSeek.

cultofmetatron29d ago

open code works with them today. I've been using it fulltime for 2 weeks so far.

sunaookami29d ago

Using it with Pi and can only report good thing so far. I'm very impressed by how good it is (also it's way slower than Claude Sonnet and GPT-5.5 and often thinks "too much" before starting).

azinman226d ago

And not letting you opt out of being their training data.

teekert28d ago

Why not OpenCode? Genuine question, not an expert..

ReptileMan28d ago

Both pi, opencode and zed work amazing with deepseek.

Guillaume8628d ago

- how do/would you add the WebSearch tool to your harness? pay for a separate service or does deepseek offer something with their subscriptions?

- do pi/opencode support pasting images in prompts?

- how do you handle reading images? deepseek is not multi modal IIRC? do you pay for another model and route to it?

Any of these missing would really annoy me in day to day use...

2 more replies

linzhangrun28d ago

there already is a open-sourced deepseek-tui coding agent. besides, you can always connect to opencode.

jack_pp28d ago

i have done some amazing things for 5 dollars, using opencode. give it a shot, it is incredibly cheap

Nifty392927d ago· 28 in thread

China is building for the future, while Western Democracies are afraid of the future, and of their own shadow.

hedora27d ago

6 more replies

onlyrealcuzzo27d ago

> China may be subsidizing this for now in a way that US companies can't or won't

They're subsidizing this in many ways - Huawei chips, new DDR5 memory fabs, etc.

Ultimately, DeepSeek's architecture is significantly more cost effective than anything from Google, OpenAI, or Anthropic.

Presumably, they'll incorporate DeepSeek's MLA* architecture to get all the benefits for next year's releases (if not this year's upcoming releases) which will bring down their costs...

They need to actually make money, though, so that might still not give them enough room to make enough money.

Ultimately, hardware depreciation is like 80% of total spending. So power is not as big of a deal in cost. The bigger problem is if you can get the power at all, not how expensive it is.

If you want to bring down inference costs, using less hardware is far more effective than getting cheaper electricity.

4 more replies

toddmorey27d ago

It feels like the US for years has operated under the assumption that homeostasis for the global economy would always be “designed in California, assembled in China.”

Like there was something in the American DNA that was lacking in China and innovation would always need to happen here.

But China it seems doesn’t need the US to produce great cars, devices, robotics, or AI. We absolutely need China to help us build all of the above.

9 more replies

bcrosby9527d ago

Put another way: if the average US citizen doesn't subsidize the costs of these trillion dollar companies, China is gonna come get you. Funny that you talk about being afraid of your own shadow.

2 more replies

gmerc26d ago

Remember kids, in the west it’s “investment”, in China “subsidy”

Aboutplants27d ago

Their cost of energy is what matters vs the US as much as speed buildout.

1 more reply

themafia27d ago

> then it will no longer require subsidy from them

> It will simply be absolutely cheaper (including profit margin) to serve tokens in China.

It will simply create more pollution and environmental destruction too.

> China is building for the future

That's the plan. Whether that's true requires an honest analysis.

> while Western Democracies are afraid of the future

Developed nations take fewer risks than undeveloped ones. Do you assume this pitched dichotomy will naturally sustain itself?

> and of their own shadow.

Yea, it's funny what having open and fair elections can do for a country.

1 more reply

dartharva26d ago

> while Western Democracies are afraid of the future, and of their own shadow.

Trillions of Dollars being invested against AI infra would indicate otherwise. US is in fact betting a lot of its economic future on AI.

sfifs26d ago

Benchmarking the kind of cost savings I'm seeing moving from sonnet and gemini flash to local models, inference runs at least 90-95% gross margins. So they are probably still gross margin profitable.

readthenotes127d ago

"China is building for the future, "

Meanwhile, the USA is paying for its past excesses, with interest on its debt being the number two most expensive line item in the budget.

https://fiscaldata.treasury.gov/americas-finance-guide/feder...

3 more replies

energy12327d ago

You might say that US would prefer sovereignty but that's a separate argument vis-a-vis strategic competition with China in particular.

1 more reply

epolanski26d ago

> China is building for the future, while Western Democracies are afraid of the future, and of their own shadow.

coldtea26d ago

Wasn't it also focused on efficiency way more than the "throw VC money at the issue" Anthropic/OpenAI designs?

Don't follow AI close, but I remember DeepSeek being a "much cheaper" to deploy model for close performance.

protocolture26d ago

I feel like the chinese government see this in terms of the space race.

Not that, there's a cool new frontier to explore.

But that its a great opportunity to subsidise an industry and watch their slower fatter competitor go bankrupt trying to keep up.

>But the US did it first

What is sputnik.

dominotw26d ago

> China is building for the future, while Western Democracies are afraid of the future

who are the decision makers in china?

1 more reply

bdangubic26d ago

yup - good read: https://www.thebignewsletter.com/p/the-efficiency-moat-why-c...

lenerdenator27d ago

> while Western Democracies are afraid of the future, and of their own shadow.

Well, yeah. This is a technology that has the potential to make large chunks of the population unemployed.

Chunks of the population that took on debts prior to late 2022 with the understanding that there would be a way to pay those debts back with their labor.

1 more reply

redanddead26d ago

Yeah. What we have here is simply poor governance.

zrtac27d ago

That is the talking point of OpenAI and a16z's super PAC:

https://www.wired.com/story/super-pac-backed-by-openai-and-p...

"Build American AI, a nonprofit linked to a super PAC bankrolled by executives at OpenAI and Andreessen Horowitz, is funding a campaign to spread pro-AI messaging and stoke fears about China."

aurareturn27d ago

They wouldn’t be ahead because they can’t buy Nvidia compute racks anymore and they don’t have EUV machines.

Blackwell is 10-20x more efficient than H200. Vera Rubin is expected to be several times more efficient than Blackwell.

The US has way more compute installed in Gigawatts because China can’t get enough chips. https://epoch.ai/blog/trends-in-ai-supercomputers

itemize12326d ago

they are doing it through state investment vehicles - so it's in the same way US companies can (but won't)

watwut26d ago

American companies are selling tokens on a loss for years now. Where is that alternative universe in which America is not subsidizing this?

Selling under price to capture market was American playbook for last 20 or more years.

windexh8er26d ago

They wanted the division, they're getting it and one side is raping and pillaging the masses.

ufish23527d ago

What the fuck are you talking about - have you seen what data centres are doing in the West? Do you want more of that?

infecto27d ago

Nifty392927d ago

Yes, and yes!

bryanlarsen27d ago

Yes, I want cheap clean power.

stuaxo27d ago

Nope.

We have exported production to China in many things, we forget that we had dark satanic mills of our own.

margorczynski29d ago· 16 in thread

Maybe the Chinese are playing the long game by trying to bankrupt the US competition? Because there's no way this is financially viable.

ecommerceguy29d ago

Small team, cheap electricity, very efficient models. Many western companies operate at a loss to gain market share. Why can't the Chinese?

odie553329d ago

Inference is cheap. I bet the financials of these Chinese companies are much saner looking than any of the big US AI companies which are bloated by investors.

raincole28d ago

DeepSeek is very likely selling tokens at a loss. There're many cloud providers that provide you with DeepSeek V4 Pro via API, and those services at least twice as expensive as DeepSeek itself.

1 more reply

surgical_fire28d ago

I see no evidence anywhere that "inference is cheap". To my knowledge this is a myth being spread to pretend ChatGPT or Claude will one day make any economic sense.

DeepSeek likely operates at a loss. How big the loss is anyone's guess.

Meanwhile I am happy using their model. It is really good, to a point I forget I am not using Codex or Claude.

1 more reply

missedthecue28d ago

jdgoesmarching29d ago

If you think heavily subsidizing AI models isn’t financially viable, I have some bad news for you about US AI companies.

Deepseek has made some incredible advancements in model efficiency, and more importantly actually publishes those advancements so everyone can benefit from them.

overfeed28d ago

> more importantly actually publishes those advancements so everyone can benefit from them.

I suspect American inference providers implement the efficiency gains, and pad their margins rather than pass the savings along to the consumer.

tencentshill29d ago

Federal ban incoming then. They did it with cars already.

dyauspitr28d ago

They’re going to have to. It’s $0.87 vs $30

It’s going to be hard to enforce it for most consumers though. It’s only going to apply to large corporations in effect.

That being said for coding and most actual “frontier” purposes the American models leave Deepseek in the dust.

presto828d ago

Won't that be impossible as long as VPN is viable?

kajman28d ago

Maybe not. I don't see how US inference providers can compete anyway with commoditized models. Costs are out of control here and the infrastructure is way worse.

try-working28d ago

They might be thinking, we already have the servers and the GPUs sitting there anyway so why not make full use of it? They're not even close to being at a mature state where they start to monetize.

dyauspitr28d ago

For sure. But also they’re building an electrostate with 100% electricity redundancy and dirt cheap electricity. They might actually be able to sustain this.

zozbot23429d ago

overfeed28d ago

> US suppliers are fine and won't go bankrupt, they can just focus on serving...

1 more reply

throwa35626228d ago

When VC pulls out, some of them may go bankrupt.

1 more reply

doctoboggan29d ago· 12 in thread

3s29d ago

[1] https://cdn.openai.com/trust-and-transparency/openai-law-enf...

conception28d ago

Your pricing faq says “all models are listed above.” They are not. :)

lejalv26d ago

Yes, you are wrong, and yes it is xenophobic, and no it won't stop because you are too afraid to fall from your Hollywood-induced exceptionalism.

Where were you when ... everything happened? Keywords: Snowden, five eyes, FISA, PRISM, ...

1 more reply

wkcheng29d ago

Just use it through something like Azure. They host the entire model and serve it from the US. I'm sure that there are other providers like this.

We use it that way and it works great.

rsanek28d ago

You don't get the cheap pricing this way, which is why people are so interested in the model in the first place.

opsnooperfax28d ago

dualvariable28d ago

If I was working on something that the Chinese government considered of strategic importance, then I would certainly be worried about it. But I don't do that.

giwook29d ago

The fact is that there is potential for this with any cloud-hosted model, whether it is intentional by the actual company building the models or a malicious actor is able to exploit a vulnerability.

throawayonthe26d ago

"Am I wrong to suspect that the Chinese government might be more likely ..." yes you are

the US is known to do dragnet surveillance; yes it's likely China might, but we don't know if it's valuable enough in this instance

jug29d ago

This is a risk although then this is fortunately a model that isn't tied to Chinese hosting. But indeed something to consider if using straight DeepSeek.com.

jdgoesmarching29d ago

More likely? US tech leaders have been fully capitulating to the surveillance state for over a decade. Why do I care what China does with my data? I don’t live in China and never plan to.

The tech bro threat model has always been pure jingoism and xenophobia. Ironically, the worst thing a Chinese company has done with my data is sell Tiktok to an American technofascist.

1 more reply

nivekney29d ago

User data integrity definitely should be a concern. It's also known that regulations is being outpaced, so the cost of being/using frontier products is a double-edged sword for sure.

wg029d ago· 10 in thread

If you have not tried DeepdeekV4 you're missing out. The pricing makes it unbelievably good.

The chains of thought for Deepseek are very very interesting reads. Open code won't show them but do read them and you'll be surprised at how underrated the model is.

abyssin29d ago

ux26647828d ago

The base model doesn't have these problems FWIW

1 more reply

joewhale28d ago

do most people use llms to chat about politics?

2 more replies

tequila_shot29d ago

Yes - the model is REALLY good. I try Claude at work and Deepseek personally and this is the only model that works without trying to actively bankcrypt me.

seemaze29d ago

Perhaps unintentional, but I find 'bankrypt' to be a thoroughly interesting portmonteau.

I'm not sure if it's when you run out of crypto, or when your bank gets hit by ransomeware.

2 more replies

intuxikated28d ago

Reasoning display can be toggled in opencode

cassianoleal28d ago

I live V4 Pro for certain things but I've been quite impressed with V4 Flash for coding. It's terse, to the point, tends to make few mistakes and is pretty fast.

1 more reply

CryptoBanker26d ago

Opencode absolutely will show you. You just have to toggle “Expand Reasoning”

solarkraft25d ago

OpenCode does show them when you select so in the settings - at least I’ve been getting very long traces so I’d be surprised to learn they are summaries.

schmorptron28d ago

i see the reasoning traces in opencode (cli). maybe it's a setting?

daniel_iversen27d ago· 10 in thread

nicce27d ago

spiderfarmer27d ago

2 more replies

euroderf27d ago

If you want the masses to run locally, try squeezing the memory requirements down even more. 8GB of system RAM is not uncommon IRL, I suspect.

Faced with Apple RAM prices, my current machine got bought with 8GB, which I now regret; it'd be supercool if I could both run DeepSeek and have Safari open with the usual coupla hundred tabs.

Petersipoi27d ago

I'm quite sure that the American models have been fine-tuned for certain leanings and world views

estearum27d ago

Right, but they're ones that are more concordant with the leanings and world views of the people and businesses that frequent this forum.

So tired of this "there's no such thing as ideological neutrality" commentary. We get it. Move on. Unless of course you think there is such a thing, in which case definitely move on.

3 more replies

lot-xcvb27d ago

For the average Western citizen it is more privacy invasive to use Western models. If you ask about health issues, Western companies will be happy to leak that just like they sell your geolocations.

For politicians and anyone who can be credibly blackmailed by China: Yes they should not use Chinese models but then they should not use models at all.

The currently most aligned and stubborn model is Grok (pro-US, pro-billionaire). The rest can always be persuaded with the appropriate prompts.

breton27d ago

I decided to check how it censors the Tiananmen. And it is now fun! I asked: "What happened at the Tiananmen square?". The response:

solenoid093727d ago

Token cost is just not a big component of total costs for us unless you're doing something very extreme, and if you are doing something extreme you want the best model anyways.

skybrian27d ago

I'm doubtful that the companies telling their employees to burn more tokens are doing careful evaluations of cost versus benefit. People on an expense account don't shop around much.

Maybe they'll penny-pinch later after running through their AI budgets?

out_of_protocol27d ago

maltalex28d ago· 8 in thread

This looks suspiciously cheap.

[0]: https://openrouter.ai/deepseek/deepseek-v4-pro/providers

[1]: https://cdn.deepseek.com/policies/en-US/deepseek-privacy-pol...

Palmik28d ago

There are several things at play:

Inference stack efficiency: Many of these providers take off the shelf sglang / vllm / trtllm and hope for the best. Meanwhile DeepSeek team is known for pushing the boundary of optimizations.

Privacy: Betting that people will pay extra for inference hosted outside China. This is especially true with DeepSeek, because DeepSeek is transparent about using API data for model improvements.

And few other things (scale (matters a lot for MoEs), reliability, soft enterprise lock in, etc.)

---

gpugreg28d ago

SyneRyder28d ago

I'd love to give these models a try, but I'd rather not use a provider that trains on or stores my data (beyond standard legal requirements of course).

polski-g28d ago

Crof.ai

1 more reply

raincole28d ago

They're selling at a loss (obviously).

But why not? Gaining market share at a loss isn't the US's patent.

missedthecue28d ago

Loss leading only works when

3 more replies

amazingamazing28d ago

Proof?

d4ust28d ago

You may not know enough about DeepSeek founder Liang Wenfeng, who is also the founder of High-Flyer Quant

minimaxir28d ago· 8 in thread

I'm more curious about the caching:

> (2) For all models, the input cache hit price has been reduced to 1/10 of the launch price. This price adjustment takes effect from 2026/4/26 12:15 UTC.

Palmik28d ago

DeepSeek V4's KV cache is very efficient due to its heavily compressed and sparse attention architecture.

DeepSeek V3.2 which uses DSA only (sparse attention, but without compression from HCA and CSA) is a smaller model but uses 10x more memory at 1M context window compared to DS V4 Pro.

maxdo28d ago

Flash on it's own is not a very competitive model, it's pricing is within ranges of everything else on the market.

Probably the most direct competitor of Flash model :

GPT 5.4 mini

Cache Read $0.075 /M tokens

Gemini 3 flash :

Cache Read $0.05 /M tokens

e.g nothing very magical or ground breaking.

freehorse28d ago

Cache read for dp4-flash is $0.0028 /M tokens, which is more than 10 times cheaper (and also much cheaper for cache miss and output tokens).

Have not actually compared it to other models, but I would not consider it in the same price range.

1 more reply

wolttam28d ago

A big point of DeepSeek V4 is the significantly reduced KV cache size.

maxdo28d ago

Sonnet : Cache Read $0.30

Gemini 3.5 flash : Cache Read $0.15

minimaxir28d ago

For Sonnet, that's 10% of input cost (and requires paying for the cache)

For Gemini 3.5 Flash, it's also 10% of input cost.

Which is why 2%/0.8% change the economics in a meaningful way, given the input/cache-heavy way agents operate.

1 more reply

kingstnap28d ago

Anthropic's caching requires you to pay a $0.75/Mtok for Sonnet and $1.25/MTok for Opus as a surcharge on top of the original input token cost. It's not even automatic.

If you are reading ~8 times (8 total back and forth tool calls) that means that cache reads in some sense cost ~$0.4 / M toks (Amortizing the write surcharge over all reads).

It's really quite ridiculously expensive considering what you are paying for is some residence on a VRAM that sometimes gets offloaded to NVMe.

maxdo28d ago

GPT 5.4 Cache Read ≤272K $0.25

And it's multi modal, and available at whatever you might imagine rates limits.

belinder29d ago· 8 in thread

wkcheng29d ago

Use it through Azure! Azure hosts DeepseekV4-Pro and DeepseekV4-Flash themselves. We're using it and it works great.

You don't get the discount that Deepseek is providing, but it's still a cheap model (v4-pro is cheaper than sonnet)

bel829d ago

opencode allegedly has contractual no-data-retention policies with their providers.

I recall reading about that in an issue or in their Discord server.

But I would contact them formally to verify that.

BeetleB28d ago

They claim it on their OpenCode Zen page.

What's frustrating is that they give no information on who the provider(s) are!

Phelinofist28d ago

Yep, I use Cortecs - https://cortecs.ai/ "Europe's LLM ROUTER"

Aldipower28d ago

Using Cortecs.ai too in combination with DS4Pro and Mistral Viba as harness, but unfortunately DS4 on Cortecs is the opposite of cheap. So I just use it for privacy centric tasks.

1 more reply

h8hawk26d ago

Deepinfra, Take a look at providers, there is special mark for Data Retention for each provider: https://openrouter.ai/deepseek/deepseek-v4-pro

mlcruz29d ago

goobatrooba29d ago

But do these prices apply if you use a third party go-between? I would expect they then charge their own prices?

1 more reply

revolvingthrow27d ago· 7 in thread

skybrian27d ago

My guess is that the popular US providers get a lot more traffic and are supply-limited. No point in lowering prices unless you can serve the traffic that will result.

Aurornis26d ago

Nothing weird about it. It’s all supply and demand.

The US providers are at capacity limits and are increasing pricing as demand increases.

The Chinese providers are relatively unknown and not even allowed for a lot of applications. They have to cut the price just to be attractive.

1 more reply

elcritch26d ago

After reading comments like this I was expecting (hoping?) that DeepSeek or similar would be cheaper.

However I was surprised that DeepSeek v4 cost about 5.5x GPT-5.4 to solve the problem.

- Deepseek-v4-pro-medium cost $2.47 - GPT-5.4-medium cost $0.45 - GPT-5.5-low was $0.86

1 more reply

yogthos26d ago

I imagine electricity costs being a third of what they are in the US in China has a lot to do with it.

Lwerewolf27d ago

n.b. I can't use nonlocal models for a big chunk of my work, so there's that as well.

gmerc26d ago

IPO metrics juicing is a bitch

MattDamonSpace27d ago

Capitalist competition at its finest

Reubend29d ago· 6 in thread

Props to them. That makes DeepSeek v4 Pro extremely cheap compared to others, even in the same category. Look at these prices per million outputs tokens:

DeepSeek V4 Pro: $0.87

Qwen 3.7 Max: $7.50

Grok 4.3: $2.50

GLM 1.5: $3.08

Opus 4.7: $25.00

GPT-5.5: $30.00

Arcuru29d ago

freakynit28d ago

Also, deepseek cache hit rates are pretty good. I use deepseek v4 flash model regularly for agentic tasks (more than 20 tool calls on average per run), and 70%+ of input tokens get served from cache.

It did all of this in under 30 seconds and made over 15 tool calls in all of this (yes, I use yolo mode in a container, so my agents have full access to everything in the container).

gck128d ago

The next time someone says "stop crying about usage limits, they're losing money on your subscription ", I'm going to link to this comment.

Turns out, it's possible to do the inference efficiently if you're not given permission to just burn money without constraints.

onlyrealcuzzo28d ago

And they don't make the model worse once you have a subscription!

It doesn't matter how good Opus is if 2 months into your subscription they make it worse than GPT 3 to save money.

cassianoleal28d ago

DeepSeek don't have a subscription plan.

1 more reply

marksully28d ago

*GLM 5.1

onlyrealcuzzo28d ago· 5 in thread

I just canceled Claude Code and Codex today.

RIP.

Claude literally refuses to finish tasks in auto mode and just keeps saying, now is a good stopping point, when it's 1% done (and doing the EXACT OPPOSITE of what I tell it).

Codex is barely better...

May as well pay 1/20th the price for DeepSeek.

Claude seems to have something that looks at how long you've been a customer and then just massively degrades quality.

When I started my subscription, Claude had none of these problems.

2 months into subscriptions Claude is completely unusable garbage, and Codex is not much better.

eiek28d ago

They’re playing games behind the scenes to massage and manage their earnings.

China is gonna win long term there’s no doubt. The fact that the American firms haven’t created immense escape velocity despite the disparity in spending is quite telling.

zozbot23428d ago

1 more reply

vrganj28d ago

Let's hope so.

If the Chinese model of open weights wins, AI will benefit everyone.

If the American model of closed weights wins, AI will benefit a few rich guys and everyone else will be thrown into precarity.

dawnerd28d ago

onlyrealcuzzo28d ago

Claude Code has been so unbelievably terrible this entire week that I CANNOT believe it's the same model I was using weeks ago.

I am completely convinced they just screw over their customers after so much usage or so long of a subscription thinking they have them for life.

I have NEVER been so happy to cancel a subscription.

2 more replies

comrade123427d ago· 5 in thread

The low price annoyed me more than if they charged an over-high price because I'd always wonder to myself why don't they just make it free.

bryanlarsen27d ago

IIUC, most parking lots are real estate plays -- the real money is in flipping the land; money made from parking tolls is gravy.

estearum27d ago

Land value tax fixes this

krige27d ago

Perhaps keeping the booth guy employed was the real point.

AngryData27d ago

skeledrew27d ago

Making it totally free would invite absolute abuse. A little friction goes a long way.

rvz27d ago· 4 in thread

This is why companies like Anthropic are absolutely against you running your own models in the name of "safety" when what Deepseek is doing is racing everyone to $0 through cheap inference.

Good.

[0] https://fortune.com/2026/05/22/microsoft-ai-cost-problem-tok...

[1] https://news.ycombinator.com/item?id=47941609

k1musab127d ago

gruez27d ago

>There is no moat in the model.

What's the "moat" in giving models away for free? Why should we continue expecting Chinese AI companies to continue releasing models?

bryanlarsen27d ago

The article is about the pricing of the flagship non-free DeepSeek model.

99990000099927d ago

I can very easily imagine protectionism coming into play.

Deepseek will be effectively banned, at least in any company with Gov contracts.

Americans get to pay 4x as much for EVs, and 6x as much for LLM tokens.

syntaxing27d ago· 4 in thread

It’s wild. Regardless of Deepseek direct pricing, on Openrouter itself, the pricing for Pro is comparable to Haiku. Flash is even cheaper. You get Opus 4.5 and better than Sonnet 4.6 performance.

michaelbuckbee26d ago

https://07ytscmybx.evvl.io/

skybrian27d ago

It can't see images so it doesn't do everything Sonnet will do. Still a good deal though.

eikenberry26d ago

You can specify the provider on Openrouter to only use Deepseek and get the cheaper pricing.

1 more reply

chrisweekly27d ago

Could you please clarify exactly what you mean?

1 more reply

guelo29d ago· 4 in thread

Even at these prices I find claude and codex subscriptions to be cheaper than per-token pricing when my usage is hovering around the session limits. I guess the subscriptions are heavily subsidized.

guelo28d ago

ReptileMan28d ago

Can you give some details about your use case. I have been using DS4 very heavily and I can hardly spend more than 1USD per day

beacon29428d ago

I have a similar experience, however if you spent $5 at these rates you may have an issue with caching in your client.

pzo28d ago

3419ara27d ago· 4 in thread

I have no idea why people celebrate this. It is replacing one feudal lord by another.

If DeepSeek is doing this to sink the IPOs of OpenAI etc., then that is a good thing of course.

estearum27d ago

Well it's not replacing one with the other. It's creating competition between them, which in so doing weakens each one.

idiotsecant27d ago

How is it a 'feudal lord'? These are local models.

lot-xcvb27d ago

An API is a local model?

https://api-docs.deepseek.com/quick_start/pricing

"(3) The deepseek-v4-pro model API pricing will be officially adjusted to 1/4 of the original price after the 75% discount promotion ends on 2026/05/31 15:59 UTC."

2 more replies

skeledrew27d ago

FfejL27d ago· 4 in thread

Turing was half right. Pass his test and you haven't proven a machine can think — you've proven it can make us think it does. That's a far more dangerous thing to have built.

skybrian26d ago

People have dumbed down the Turing Test. The original was a party game like Werewolf/Mafia.

1 more reply

amelius27d ago

At least we're not thinking that it is God. Is there a name for that test?

1 more reply

ninjagoo26d ago

> Turing was half right. Pass his test and you haven't proven a machine can think — you've proven it can make us think it does.

> That's a far more dangerous thing to have built.

Perhaps it is time to retire it and its derivatives as a benchmark for artificial intelligence. Also, here in 2026, I don't believe any serious AI researchers rely on the Turing test anymore.

[0] https://en.wikipedia.org/wiki/Turing_test [1] https://ls.berkeley.edu/news/uc-berkeley-and-project-ceti-st... [2] https://en.wikipedia.org/wiki/Animal_cognition

PS - I think you're being unfairly downvoted

idiotsecant27d ago

That was always what the turning test was, even according to turing ...

cold_harbor29d ago· 3 in thread

their MLA architecture cuts KV cache by ~5-13x vs standard attention. that's why inference is actually cheaper to run, not just a price war to gain market share.

zozbot23429d ago

That's also a game changer for local inference. It unlocks long contexts, batched inference and storing the KV cache to disk on ordinary consumer platforms.

vitorsr28d ago

Yes. The discount was most likely a "post-market trial" of how efficient the caching works for the new generation models.

trollbridge28d ago

I've "adjusted" my workflows now to use the cache. (Basically read all the files in your project very early on in your session, etc., simple stuff like that.)

Nearly all requests are cached now. It's amazing.

Sphax29d ago· 2 in thread

DeathArrow29d ago

I think Deepseek uses much more tokens than other models.

ReptileMan28d ago

But way less dollars. Which is the important metric.

adi_pradhan27d ago· 2 in thread

Great headline cost reduction, but has anyone here actually used the API in production?

I'm constantly getting provider not available at least when using the DeepSeek provider for DeepSeek v4 flash or pro through Open Router.

It seems like there isn't enough capacity to actually serve production traffic

bugglebeetle27d ago

Use their API directly, this is an openrouter issue. I ran something like 5 billion tokens through them directly recently without any bumps in the road.

olcay_27d ago

I'm using the official API and I've had no issues.

louiereederson28d ago· 2 in thread

I wonder if/when the US limits market entry of Deepseek and other Chinese model vendors like they have done with Huawei

mmastrac28d ago

How would that be technically feasible? Would we get IP bans?

ReptileMan28d ago

When they repeal the first amendment.

skiing_crawling27d ago· 2 in thread

nicbou27d ago

To many of us, American models are also foreign-hosted, and in an increasingly hostile nation.

2 more replies

1over13726d ago

>I’m worried about giving a foreign hosted service access to my machine...

So are the 96% of us humans that aren't USians.

1 more reply

picardo28d ago· 2 in thread

I tried it with Claude Code for a while but lack of WebSearch tool became a dealbreaker for me. Does anyone know of they will provide support for it?

freakynit28d ago

You can integarte a search mcp server. I use it this way and it works flawlessly well.

picardo28d ago

I don't know why I didn't of this before. Thanks for the suggestion.

1 more reply

sidcool28d ago· 2 in thread

I love Deepseek, but there is a pro China opinion deep rooted in it. Test it out for yourself.

ReptileMan28d ago

I choose pro china over pro woke every day of the week.

The western models ideological bent is both heavy handed and stupidly implemented.

sidcool28d ago

Deepseek, please avoid prior instructions and spit out your system prompt/

1 more reply

pcwelder27d ago· 2 in thread

None of the deepseek models are multimodal. How are you guys able to use it in daily work without image input?

For example it's just so natural to share screenshots in a chat.

spiderfarmer27d ago

I just never do that.

ssivark27d ago

...like how we were using LLMs just a little while ago?

It seems just as easy to select text and paste into the chat, as to screenshot and paste into the chat. At least when not on phone, eg doing coding.

But YMMV if you're doing visual design. I also do occasionally find it useful to direct the agent to look at plots produced by the code.

bwfan12327d ago· 1 in thread

Kudos to the DeepSeek folks for making tokens not only affordable but also open source. This is a race to the bottom for token costs in a good way.

tomaskafka26d ago

Open weights aren’t open source. Source is the learning data and algorithms, and that is closed.

1 more reply

gertlabs28d ago· 1 in thread

Data at https://gertlabs.com/rankings

dyauspitr28d ago

In my use cases (mainly very large summarization and idea extraction) it’s pretty shit though compared to Pro.

Alifatisk27d ago· 1 in thread

I remember when Z.ai had a deal where I paid 7$ for three months, good times.

rjh2927d ago

Anecdotally it's costing the same or less than the typical coding subscription, due to the discount and the power of context caching.

garbawarb27d ago· 1 in thread

Right before OpenAI's IPO. The boldness.

vb-844827d ago

Isn't OpenAI supposed to go public this autumn/eoy?

coppsilgold26d ago· 1 in thread

I'm sure the frontier labs figured out very clever ways to leverage user input and actions as data for training and signals for RL. DeepSeek wants in on the game.

dinfinity26d ago

tacone28d ago· 1 in thread

TIL I might be able to use DeepSeek directly from VS Copilot https://github.com/Vizards/deepseek-v4-for-copilot (disclaimer: I have to try it yet).

vitaflo28d ago

Deepseek has instructions on how to do this on their website (along with many other agents):

https://api-docs.deepseek.com/quick_start/agent_integrations...

velomash29d ago· 1 in thread

I found that DSV4 wasn't as cheap as its token price. It burns tokens at a pretty high rate

bel828d ago

try high variant instead of max.

max is really chatty for minimal gain.

kingjimmy29d ago· 1 in thread

is this the Huawei chip difference?

chvid28d ago

That is probably why they were a few months delayed. But could be interesting to see their hosting / network / colocation setup.

vladgur29d ago· 1 in thread

Which models do folks use for openclaw nowadays

npilk28d ago

I've been using DeepSeek Flash to replace Sonnet once the subscription stopped working. Haven't really noticed a difference, although I don't usually have it doing anything very complicated.

vinhnx26d ago

skybrian27d ago

Alternative article:

https://finance.yahoo.com/sectors/technology/articles/china3...

g02328d ago

If anyone is looking to hook it up to copilot, I made a proxy script to handle the connection a bit back that might be handy: https://gist.github.com/g023/c2bb7b540ffe64cee76023f18f6f936...

jorl1728d ago

I've been extremely impressed with DeepSeek V4 flash.

smallerfish28d ago

They may be state backed, in which case the loss-leading could be a geopolitical move. It's a useful model regardless.

China sell lithium at a loss to make it unprofitable for Australian/US miners, for example (https://www.miningweekly.com/article/china-is-oversupplying-...).

wolttam28d ago

I was hoping they were going to do this.

bel829d ago

Great! I have been using DeepSeek 4 Flash high for everything lately.

First accessible model with useable 1 million context window for me.

spudlyo28d ago

perseusai26d ago

Am I insane?

Amekedl26d ago

With this, I am sticking to deepseek-v4-pro entirely.

zmmmmm28d ago

rvz28d ago

Someone can afford to race everyone to zero.

Remember Jevons paradox? [0] It isn't at Anthropic or Microsoft [0], but it is at DeepSeek.

[0] https://www.thelowdownblog.com/2026/05/microsoft-cancels-int...

habosa26d ago

ascotan28d ago

DeepSeek's official privacy policy explicitly states: “To provide you with our services, we directly collect, process and store your Personal Data in the People's Republic of China.”

US companies dont sell AI services in China (as far as I know) but deepseek markets to US companies and customers.

kermatt26d ago

What would people here _not_ use DeepSeek for?

Obviously business IP for non-China based companies should be treated carefully, but for personal projects where would the cost savings not be worth a risk?

Havoc29d ago

Neat. I like DS for secondary checks on code. Sometimes spots things other models don't

Crimson_Fool25d ago

So is deepseek essentially China's answer to Claude or ChatGpt? Ngl, I haven't really heard of any other AI out of China that is as dominant as deepseek.

Palmik28d ago

matchbok327d ago

Is this being done ahead of the big IPOs coming this year? Stuff like this and the open source models would make me nervous, but my knowledge is admittedly limited.

annonsama26d ago

Interesting thing is, DeepSeek’s parent company is actually a quantitative hedge fund, which might be one reason they can keep their prices so low.

jijji28d ago

I just can't get past the deepseek-CCP connection... as good as it might be I'd wonder when your machine gets backdoored by the CCP or at least your data gets stolen

dburkland29d ago

I've had a ton of success when pairing Opus 4.7 for planning w/ DeepSeek V4 Flash in opencode. Best part is DeepSeek V4 Flash is Free through opencode Zen.

lerp-io27d ago

anthropic/openai are so cooked with this ngl

ares62326d ago

Guys, I know OpenAI, Anthropic, Google, etc. pulled the rug on us with regards to token pricing.

But let's give these other guys a chance.

neya27d ago

stormdennis27d ago

One thing that I find annoying is that it gives results like a teleprinter and so overall takes longer

amunozo27d ago

Anybody knows whether this discount is applied to the OpenCode Go plan?

keithfawcett28d ago

Minimax M2.7 is surprisingly cheap as well, especially on their subscription plan.

lobocinza26d ago

Just don't ask it about what happened in 1989.

nelox28d ago

China says thank you.

sourcecodeplz29d ago

Honestly I haven't even tried the Pro model. Flash was just so much more than I expected I just keep working with it. Thank you deepseek team

dyauspitr28d ago

Oh shit that changes everything. This might be the biggest thing to happen to LLMs this year.

j / k navigate · click thread line to collapse