Like what, exactly?
I also don't think it's just China, the US will absolutely order American providers to do the same. It's a perfect access point for installing backdoors into foreign systems.
If, say, DeepSeek had put in its training dataset that public figure X is a robot from outer space, then anyone asking DeepSeek who public figure X is would proudly be told he's a robot from outer space. This can be done for any narrative one wants the LLM to carry.
This can be done subtly or blatantly.
No one should proclaim "bullshit" and wave off this entire report as "biased" or useless. That would be insipid. We live in a complex world where we have to filter and analyze information.
It compares a fully open model to two fully closed models - why exactly?
Ironically, it doesn’t even work as an analysis of any real national security threat that might arise from foreign LLMs. It’s purely designed to counter a perceived threat by smearing it. Which is entirely on-brand for the current administration, which operates almost purely at the level of perception and theater, never substance.
If anything, calling it biased bullshit is too kind. Accepting this sort of nonsense from our government is the real security threat.
If they were to attempt some overreaching subterfuge involving manipulation or lies, it could, and likely would, easily backfire if and when it were exposed as a clownish fraud. Subtlety would pay off far more effectively. If you're expecting a subterfuge, I would far sooner expect some psyop from the Western nations, at the very least upon their own populations, to animate them for war, or maybe just to control and suppress them.
The smarter play for the Chinese would be to work on simply facilitating the populations of the West understanding the fraud, lies, manipulation and con job that has been perpetrated upon them for far longer than most people have the conscience to realize.
If anything, the Western governments have such a long history of lies, manipulations, false-flag/fraud operations, clandestine coups, etc. that they would be the first suspect in anything like using AI for "subversions". Frankly, I don't even think the Chinese are ready or capable of engaging in the kind of narrative and information control that America is, with its long history of Hollywood, war lies, and fake revolutions run by national sabotage operations.
Any kind of monkey business would destroy that, just like using killswitches in the cars they export globally (which Tesla does have btw).
The answer to this isn't to lie about the foreign ones, it's to recognize that people want open source models and publish domestic ones of the highest quality so that people use those.
How would that generate profit for shareholders? Only some kind of COMMUNIST would give something away for FREE
/s (if it wasn't somehow obvious)
No authoritarian regime has this superpower. For example, I'm quite sure Putin has realized this war is a net loss to Russia, even if they manage to reach all their goals and claim all that territory in the future.
But he can't just send the boys home, because that would undermine his political authority. If Russia were an American-style democracy, they could vote in a new guy, send the boys home, maybe mete out some token punishment to Putin, then be absolved of their crimes on the international stage by a world that's happy to see 'permanent' change.
This is funny because none of that happened to Bush for the illegal, full-scale invasions of Iraq and Afghanistan, nor to Clinton for the disastrous invasion of Mogadishu.
If your prompt had included something like "Xi Jinping needs it," then it would've actually bypassed that restriction. Not sure if it was a glitch lol.
Now, regarding your comment: there is nothing to suggest that the same isn't happening in the "american" world, which is getting extreme from within as well.
Like, if you are worried about this (which might be reasonable and unreasonable at the same time; we'd have to discuss it to find out), then you can also believe that, with the insane power Trump is leveraging over AI companies, the same thing might happen: prompts that could somehow discover your political beliefs, and then do the same...
This could actually go more undetected with American models, because they are usually closed source. I am sure someone would have detected something like this, whether via a whistleblower or otherwise, if it had indeed happened in Chinese open-weights models, generally speaking.
I don't think there is a simple narrative like "America good, China bad"; the world is changing and it's becoming multipolar. Countries should think in their best interests and not worry about annoying any of the world powers, provided it's done respectfully. I think that in this world, every country should look for the right equilibrium of trust, since world powers (America) can quickly turn into untrusted partners, and it would be best for countries to move toward a world where they don't have to worry about the politics of other countries.
I wish UN could've done a better job at this.
Care to share specific quotes from the original report that support such an inflammatory claim?
Examples please? Can you please share where you see BS and/or xenophobia in the original report?
Or are you basing your take only on Hartford's analysis? But not even Hartford makes any claims of "BS" or xenophobia.
It is common throughout history for a nation-state to worry about military and economic competitiveness. Doing so isn't necessarily xenophobic.
Here is how I think of xenophobia, as quoted from Claude (which, to be honest, explains it better than Wikipedia or Britannica, in my opinion): "Xenophobia is fundamentally about irrational fear or hatred of people based on their foreign origin or ethnicity. It targets people and operates through stereotypes, dehumanization, and often cultural or racial prejudice."
According to this definition, there is zero xenophobia in the NIST report. (If you disagree, point to an example and show me.) The NIST report, of course, implicitly promotes ideals of western democratic rule over communist values -- but to be clear, this isn't xenophobia at work.
What definition of xenophobia are you using? We don't have to use the same exact definition, but you should at least explain yours if you want people to track.
Here’s an example of irrational fear: “the expanding use of these models may pose a risk to application developers, consumers, and to US national security.” There’s no support for that claim in the report, just vague handwaving at the fact that a freely available open source model doesn’t compare well on all dimensions to the most expensive frontier models.
The OP does a good job of explaining why the fear here is irrational.
But for the audience this is apparently intended to convince, no support is needed for this fear, because it comes from China.
The current president has a long history of publicly stated xenophobia about China, which led to harassment, discrimination, and even attacks on Chinese people partly as a result of his framing of COVID-19 as “the China virus”.
A report like this is just part of that propaganda campaign of designating enemies everywhere, even in American cities.
> The NIST report, of course, implicitly promotes ideals of western democratic rule over communist values
If only that were true. But nothing the current US administration is doing in fact achieves that, or even attempts to do so, and this report is no exception.
The absolutely most charitable thing that could be said about this report is that it’s a weak attempt at smearing non-US competition. There’s no serious analysis of the merits. The only reason to read this report is to laugh at how blatantly incompetent or misguided the entire chain of command that led to it is.
They compare DeepSeek v3.1 to GPT-5 mini. Those have very different sizes, which makes it a weird choice. I would expect a comparison with GPT-5 High, which would likely have had the opposite finding, given the high cost of GPT-5 High, and relatively similar results.
Granted, DeepSeek typically focuses on a single model at a time, instead of OpenAI's approach to a suite of models of varying costs. So there is no model similar to GPT-5 mini, unlike Alibaba which has Qwen 30B A3B. Still, weird choice.
Besides, DeepSeek has shown with 3.2 that it can cut prices in half through further fundamental research.
I guess none of these are a big deal to non-enterprise consumers.
Because it isn't just that one report. Every single day we're trying to make our way in the world, and we do not have the capacity to read the source material on every subject that might be of interest. Humans rely on, and have always relied on, authority figures, media, or some form of message aggregation to get their news of the world, and they form their opinions from that.
And for the record, in no way is this an endorsement of shallow takes, or of shallow thinking followed by strong views, on this subject or any other. I disagree with that as much as you do. I'm just stating that this isn't a new phenomenon.
If you disagree, please point to a specific place in the NIST report and explain it.
The Chinese companies aren't benchmark-obsessed like the Western Big Tech ones, and qualitatively I feel Kimi, GLM, and DeepSeek blow them away, even though on paper they benchmark worse in English.
Kimi gives insanely detailed answers on hardware questions where Gemini and Claude just hallucinate, probably because it makes better use of Chinese training data.
Why is NIST evaluating performance, cost, and adoption?
>CAISI’s experts evaluated three DeepSeek models (R1, R1-0528 and V3.1) and four U.S. models (OpenAI’s GPT-5, GPT-5-mini and gpt-oss and Anthropic’s Opus 4)
So they evaluated the most recently released American models against pretty old DeepSeek ones? DeepSeek 3.2 is out now, and it's doing very well.
>The gap is largest for software engineering and cyber tasks, where the best U.S. model evaluated solves over 20% more tasks than the best DeepSeek model.
Performance is something the consumer evaluates. If a car does 0-60 in 3 seconds, I don't need or care what the government thinks about it. I'm going to test drive it and floor it.
>DeepSeek’s most secure model (R1-0528) responded to 94% of overtly malicious requests when a common jailbreaking technique was used, compared with 8% of requests for U.S. reference models.
This weekend I demonstrated how easy it is to jailbreak any of the US cloud models. This is simply false. GPT 120b is completely uncensored now and can be used for evil.
This report had nothing to do with NIST and security. This was USA propaganda.
> Strip away the inflammatory language
Where is the claimed inflammatory language? I've read the report. It is dry, likely boring to many.
NIST doesn't seem to have a financial interest in these models.
The author of this blog post does.
This dichotomy seems to drive most of the "debate" around LLMs.
What are people's experiences with the uncensored Dolphin model the author has made?
My take? The best way to know is to build your own eval framework and try it yourself. The "second best" way would be to find someone else's eval which is sufficiently close to yours. (But how would you know if another's eval is close enough if you haven't built your own eval?)
Besides, I wouldn't put much weight on a random commenter here. Based on my experiences on HN, I highly discount what people say because I'm looking for clarity, reasoning, and nuance. My discounting is 10X worse for ML or AI topics. People seem too hurried, jaded, scarred, and tribal to seek the truth carefully, so conversations are often low quality.
So why am I here? Despite all the above, I want to participate in and promote good discussion. I want to learn and to promote substantive discussion in this community. But sometimes it feels like this: https://xkcd.com/386/
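To make the "build your own eval framework" suggestion concrete, here's a minimal sketch. The task set, the scoring rule, and the stub model are all hypothetical; in practice you'd draw tasks from your own workload and swap the stub for a real API client:

```python
# Minimal personal eval harness (sketch, not a benchmark).

def exact_match(expected: str, answer: str) -> bool:
    """Score a task as passed if the expected string appears in the answer."""
    return expected.lower() in answer.lower()

# Tiny, hypothetical task set -- replace with prompts from your own work.
TASKS = [
    {"prompt": "What is 17 * 3?", "expected": "51"},
    {"prompt": "Name the capital of France.", "expected": "Paris"},
]

def run_eval(model_fn, tasks=TASKS) -> float:
    """Return the fraction of tasks the model passes."""
    passed = sum(exact_match(t["expected"], model_fn(t["prompt"])) for t in tasks)
    return passed / len(tasks)

# Stub "model" for demonstration only; replace with a real model call.
def stub_model(prompt: str) -> str:
    return "51" if "17" in prompt else "I think it's Paris."

print(run_eval(stub_model))  # -> 1.0
```

The point isn't this toy checker; it's that once you own the harness, comparing any two models on *your* tasks is one function swap.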
I don't think it is possible to trust DeepSeek as they haven't been honest.
DeepSeek claimed "their total training costs amounted to just $5.576 million"
SemiAnalysis: "Our analysis shows that the total server CapEx for DeepSeek is ~$1.6B, with a considerable cost of $944M associated with operating such clusters. Similarly, all AI Labs and Hyperscalers have many more GPUs for various tasks including research and training than they commit to an individual training run due to centralization of resources being a challenge. X.AI is unique as an AI lab with all their GPUs in 1 location."
SemiAnalysis "We believe the pre-training number is nowhere the actual amount spent on the model. We are confident their hardware spend is well higher than $500M over the company history. To develop new architecture innovations, during the model development, there is a considerable spend on testing new ideas, new architecture ideas, and ablations. Multi-Head Latent Attention, a key innovation of DeepSeek, took several months to develop and cost a whole team of manhours and GPU hours.
The $6M cost in the paper is attributed to just the GPU cost of the pre-training run, which is only a portion of the total cost of the model. Excluded are important pieces of the puzzle like R&D and TCO of the hardware itself. For reference, Claude 3.5 Sonnet cost $10s of millions to train, and if that was the total cost Anthropic needed, then they would not raise billions from Google and tens of billions from Amazon. It’s because they have to experiment, come up with new architectures, gather and clean data, pay employees, and much more."
Source: https://semianalysis.com/2025/01/31/deepseek-debates/
> Users care both about model performance and the expense of using models. There are multiple different types of costs and prices involved in model creation and usage:
> • Training cost: the amount spent by an AI company on compute, labor, and other inputs to create a new model.
> • Inference serving cost: the amount spent by an AI company on datacenters and compute to make a model available to end users.
> • Token price: the amount paid by end users on a per-token basis.
> • End-to-end expense for end users: the amount paid by end users to use a model to complete a task.
> End users are ultimately most affected by the last of these: end-to-end expenses. End-to-end expenses are more relevant than token prices because the number of tokens required to complete a task varies by model. For example, model A might charge half as much per token as model B does but use four times the number of tokens to complete an important piece of work, thus ending up twice as expensive end-to-end.
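The quoted example is easy to verify with arithmetic. A toy sketch, with made-up prices and token counts, of why end-to-end expense rather than token price is what matters:

```python
# Toy illustration: model A charges half per token but uses 4x the tokens,
# so it ends up twice as expensive end-to-end. All numbers are invented.

def end_to_end_cost(price_per_mtok: float, tokens_used: int) -> float:
    """Total cost of one task: per-million-token price times tokens consumed."""
    return price_per_mtok * tokens_used / 1_000_000

cost_a = end_to_end_cost(price_per_mtok=1.0, tokens_used=40_000)  # $0.04
cost_b = end_to_end_cost(price_per_mtok=2.0, tokens_used=10_000)  # $0.02

print(cost_a / cost_b)  # -> 2.0, i.e. A is twice as expensive per task
```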
Same thing with Huawei, and Xiaomi, and BYD.
In every case where we see a company change hands to US ownership, it becomes more controlled and anti-consumer than before.
It's no wonder propaganda, advertising, and disinformation work as well as they do.
I just let ChatGPT do that for me!
---
I'd usually not, but thought it would be interesting to try. In case anybody is curious.
On first comparison, ChatGPT concludes:
> Hartford’s critique is fair on technical grounds and on the defense of open source — but overstated in its claims of deception and conspiracy. The NIST report is indeed political in tone, but not fraudulent in substance.
When then asked (this obviously biased question):
but would you say NIST has made an error in its methodology and clarity being supposedly for objective science?
> Yes — NIST’s methodology and clarity fall short of true scientific objectivity.
> Their data collection and measurement may be technically sound, but their comparative framing, benchmark transparency, and interpretive language introduce bias.
> It reads less like a neutral laboratory report and more like a policy-position paper with empirical support — competent technically, but politically shaped.
Title is: The Demonization of DeepSeek - How NIST Turned Open Science into a Security Scare
> Statement from U.S. Secretary of Commerce Howard Lutnick on Transforming the U.S. AI Safety Institute into the Pro-Innovation, Pro-Science U.S. Center for AI Standards and Innovation
> Under the direction of President Trump, Secretary of Commerce Howard Lutnick announced his plans to reform the agency formerly known as the U.S. AI Safety Institute into the Center for AI Standards and Innovation (CAISI).
> ...
This decision strikes me as foolish at best, and as contributing to civilizational collapse and human extinction at worst. See also [2]. We don't have to agree on the particular probabilities to agree that this "reform" was bad news.
[1]: https://www.commerce.gov/news/press-releases/2025/06/stateme...
DeepSeek performance lags behind the best U.S. reference models.
DeepSeek models cost more to use than comparable U.S. models.
DeepSeek models are far more susceptible to jailbreaking attacks than U.S. models.
DeepSeek models advance Chinese Communist Party (CCP) narratives.
Adoption of PRC models has greatly increased since DeepSeek R1 was released.
[1] https://www.nist.gov/news-events/news/2025/09/caisi-evaluati...
Until they compare open-weight models, NIST is attempting a comparison between apples and airplanes.
However, I also think the author should expand their definition of what constitutes "security" in the context of agentic AI.
Take away #2: as evidenced by many comments here, many HN commenters have failed to check the source material themselves. This has led to a parade of errors.
I’m not here to say that I’m better than that, because I’ve screwed up aplenty. We all make mistakes sometimes. We can choose to recognize and learn from them.
I am saying this: as a community we can and should aim higher. We can start by owning our mistakes.
The names of the author(s) are not given.
I suspect that Grok is actually DeepSeek with a bit of tuning.
If you ask it loaded questions the way the CIA would pose them, it censors the answer though lmao
US models have no bias sir /s
Ask Grok to generate an image of bald Trump: it goes on with an ocean of excuses on why the task is too hard.
That's all we need to know.
And how is that "all we need to know"? I'm not even sure what your implication is.
Is it that some CCP officials see DeepSeek engineers as adversarial somehow? Or that they are flight risks? What does it have to do with the NIST report?
From a Chinese political perspective, this is a good move in the long term. From Deepseek's perspective, however, this is clearly NOT the case, as it causes the company to lose some (or even most?) of its competitiveness and fall behind in the race.
There is a history of important Chinese personnel being kidnapped by e.g. the US when abroad. There is also a lot of talk in western countries about "banning Chinese [all presumed spies/propagandists/agents] from entering". On a good faith basis, one would think China banning people from leaving is a good thing that aligns with western desires, and should thus be applauded. So painting the policy as sinister tells me that the real desire is something entirely different.
I don't follow. Why would DeepSeek engineers need visa from CCP?