Anthropic's Safety Superpower (opens in new tab)

(stratechery.com)

214 pointsswolpers9d ago192 comments

192 comments

87 comments · 20 top-level

botw449d ago· 25 in thread

The whole thesis falls apart though. You can't be on your way to "power over everything" and get distilled into free Chinese models within months. Pick one.

The bottleneck is compute and data, not the model. That's why they could only gate it for a bit. The ITAR thing proves it: no nationality controls in place, so the only option was killing the whole thing. Not exactly what an all-powerful gatekeeper does.

embedding-shape9d ago

> The whole thesis falls apart though. You can't be on your way to "power over everything" and get distilled into free Chinese models within months. Pick one.

But is that last part actually true though? Sure, there might be 600B+ models available for download and local inference if you have the hardware, but does the users who use Anthropic switch over to those even if they're available even as hosted models? Seems like some do, most don't, Anthropic and Claude remains very popular among the people who use LLMs, there is no denying that.

vbezhenar9d ago

> does the users who use Anthropic switch over to those even if they're available even as hosted models?

I'm currently spending $200 for Claude. That's around my maximum that I can afford. I could stretch that to $500 I guess. But I saw reports of people spending tens of thousands of dollars with Claude API. That's certainly outside of my budget.

So if/when Anthropic decides to stop subsidizing subscription (if they ever do that thing, I still not sure about that), I'll certainly look at the other options. And available "open weights" LLMs hosted by someone will be my first pick. Right now Claude 4.8 feels very advanced, but things move very fast...

2 more replies

ForHackernews9d ago

> Anthropic and Claude remains very popular among the people who use LLMs

Only because someone else is paying the bills. I use Claude Opus at work because my employer pays for the tokens and encourages me to do it.

At home, I use DeepSeek Flash. It's not as good, but it's maybe 0.7 quality for 0.001 cost.

3 more replies

FuriouslyAdrift9d ago

The hotness we are seeing is smaller 'expert' models with an 'orchestrator' model in front that evaulates the prompts and routes to the appropiate small models and then synthesizes the collected answer. Easier to split across many smaller, cheaper servers and more efficient than a huge monolithic model.

1 more reply

halJordan9d ago

I don't think you're appropriately understanding the full gamut. The individuals who only spent $200/months will be stuck. But the pie is increasing in size, it's not stagnant. There are a lot of orgs who can afford to run a 1T model and even more that can run a 600B model. These newcomers are what's being fought over

xboxnolifes8d ago

People dont pivot on a dime. If there stopped being major model improvements for a few years and equivalent free models have been out during the same period, we will see people slowly move over to competitors.

_the_inflator9d ago

I disagree. It is not the model alone. It needs a system which capitalizes on it. And this is very complex. Hardware, software, architecture - it takes a lot to get it right.

Try running the latest OS models on a normal Mac or PC. Claude Fable and Mythos are systems not just pure models.

And of course marketing. Don't believe the hype.

I think Claude is often times underwhelming. Security concerns are also a concern companies have a blond spot for. The really toughest pro security (Yes, pro! Totally different framing!) company I know is Google after all.

What I can companies advise to do is, really having more than just bug bounties but a professional hacker team that does nothing else but attacking them the whole day and night 24/7. This needs to be coordinated with the government otherwise you might sound an alarm and will be SWATed for doing good. And I would pay them huge sums since the risk and fallout warrant such a treatment, not the standard wage.

Hackers are the real deal, not AI. Proof: Hackers using AI.

christkv9d ago

For now I suspect however that the gigantic models are not needed and you will be able to do pretty much what you need in a specific domain with 120b or lower. There is so much trash in the frontier models. I don't need all the world's slam poetry for my coding tasks for example.

1 more reply

zozbot2349d ago

> Try running the latest OS models on a normal Mac or PC.

It can be done through the magic of SSD offload. The worst case involves seconds-per-token speeds, but that's OK if you only care about low volumes of slow unattended inference, which maximizes utilization for the hardware.

(The real worst case, where you're streaming the whole model from the cheapest storage you could feasibly think of, involves multiple minutes per token for a single inference, or even hours per token batch if you're doing many inferences in bulk. That's a lot less helpful, so there's a space for smaller models at the edge, even for unattended workloads.)

nerdsniper9d ago

> I disagree. It is not the model alone. It needs a system which capitalizes on it. And this is very complex.

AFAICT … despite saying you “disagree”, you appear to be agreeing with the parent comment that the model is less important and compute (all that complex infra) and data (also complex infra) are more important.

trollbridge8d ago

An LLM which provides an OpenAI or Anthropic API-compatible interface + a coding harness like OpenCode or oh-my-pi is a pretty easy "ecosystem" to replicate. Exactly what makes you say Fable or Mythos are "systems, not just pure models"?

1 more reply

ramblurr9d ago

> > The bottleneck is compute and data, not the model.

> I disagree. It is not the model alone. It needs a system which capitalizes on it. And this is very complex. Hardware, software, architecture - it takes a lot to get it right.

What do you disagree with exactly?

olmo239d ago

> no nationality controls in place

Not for now, but how long before we have KYC regulations concerning LLMs?

thefounder9d ago

That’s really what Dario wants. Let’s hope he doesn’t get it

3 more replies

throw12345678919d ago

Yeah yeah, but after the IPO!

zozbot2349d ago

"Distillation" from APIs is not a thing, it cannot replicate a model's deep reasoning and behavior.

bob10299d ago

I struggle with the practicality of the whole thing.

The amount of tokens required to properly distill a frontier model is so large that by the time you could consume the # of tokens you would either be banned for extremely obvious abuse or a new model would be released, rendering your efforts less and less valuable over time. Intelligence is not a linear thing. Being behind just a little bit can have exponential consequences.

1 more reply

archon9d ago

I'm uneducated on how distillation works at more than a basic level so forgive me if this is a stupid question.

Isn't "distillation" of another provider's model exactly how these models got training date in the first place: Massive amounts of the written word + Prompt -> Answer. Why wouldn't distillation produce similar "reasoning" in the new model? It's just inputs and outputs.

2 more replies

saberience9d ago

This is totally inaccurate, the APIs provide the reasoning logs. You ABSOLUTELY can distill from APIs, in fact, that's the primary way distillation is done currently.

1 more reply

slowmovintarget9d ago

That thesis is not about what Anthropic will achieve, but about what power they think they ought to have.

That's a different problem that what you're arguing against.

almostdeadguy8d ago

To this point, I've never understood the supposed "alignment" between the EA/AI Safety crowd and Anthropic's mission that the author comments on. Be the stewards of the Machine God, but responsibly? I think the Manhattan project, which AI development is commonly analogized to, had a lot more intrinsic properties to gate against uncontrolled proliferation (which still happened to some extent). Also this is a company that is expected to go public this year, at which point there will be a slew of new voices pushing the company to increase its value, mission be damned.

People like Yud at least have a clear consistency in their advocacy that we shouldn't be developing this at all. Anyone who thinks they can reconcile Anthropic's work with the AI safety mission is in total fantasyland, if it's not just a public persona they've adopted strategically.

anon3738398d ago

My $.02: I think that these people working at Anthropic are the dumbest bright people alive, with weird and delusional beliefs. And I think the company’s leadership knows how to put them to work in service of Anthropic’s blatantly self-serving, plainly evil agenda.

Also, the fact that these employees are now in the position to outbid one another for 8-figure real estate gives them a powerful incentive to keep “believing”.

barrkel9d ago

Do you think token completion endpoints are the final form for AI APIs?

swalsh9d ago

The distilled versions miss the spark of the model. Its like they land in the uncanny valley of models.

realusername9d ago

They get to 80% of the top models for 10x cheaper, unless you don't care about the money at all, it's hard to ignore.

smackeyacky9d ago· 10 in thread

Perhaps they should consider leaving the US. Pretty clearly the descent into a corrupt autocracy is having real consequences.

mft_9d ago

Where would they go?

1) It’s safe to assume the US would do its best to prevent it, and even if Anthropic was successful in exfiltrating their data, code, models, and people, I’d imagine the US would immediately block all US companies from working with them. So they’d be blocked from their own US-based compute, plus Google, Amazon, Microsoft, xAI, Meta, etc.

2) Where would they go? China maybe, but as far as we can tell it doesn’t have sufficient compute for Anthropic’s level of need. The EU likely as or more restrictive in different ways to the US - the EU is hardly buzzing with AI innovation. Some Middle Eastern countries might have the money, energy, and interest in carving out such a position, but no compute. Plus I’d imagine the US would act directly against any country or region receiving them, economic or otherwise.

3) Then, as said elsewhere, the US would block GPU sales to wherever they found a safe haven, preventing the buildup of the compute they’d need to continue.

0x3f9d ago

> the EU is hardly buzzing with AI innovation

Depends what you mean. The academic work seems largely... fine? Plenty of good work came out of Europe or European researchers. It seems the problem is more "trying to build a trillion-dollar company of any kind".

It's an interesting question: does the EU seek only to regulate successful modern American companies to death, or home grown ones too? Probably not a gamble worth taking.

2 more replies

Zealotux9d ago

Does any other place have the infrastructure Anthropic requires to train their models and run inference?

ramon1569d ago

No. If we cannot even have an EU CloudFlare, then we definitely do not have the infra for this kind of computing.

The EU options are not even close to what CF can do

2 more replies

re-thc9d ago

> Does any other place have the infrastructure

That's not the problem.

The US government can export ban GPUs like they do now to more countries if needed. Even if the infrastructure exists, the GPUs won't.

pantalaimon9d ago

UAE would be happy to pay for it

mcmcmc9d ago

China

1 more reply

freejazz9d ago

Suddenly its not in SV's favor? Depends who you ask, I guess

xienze9d ago

Oh please, the earlier spat with the Trump admin was the best thing that ever happened to Anthropic. Before that, Claude was really only well-known in developer circles, not the wider normie-sphere. After Anthropic got the "Trump hates them, so it MUST be good!" stamp of approval, the company's recognition and popularity took off.

This too, will end up being a good thing for them. The ban will end up getting lifted due to some "amazing deal" in the coming weeks and Anthropic will now have the "Trump tried to ban them, so they MUST have the most advanced AI model in the world!" stamp of approval just before IPO.

All this stuff is pro wrestling kayfabe.

MattRix9d ago

If you’re implying that the government is in on it and is doing this stuff intentionally in order to boost Anthropic, that’s ridiculous.

1 more reply

chasil9d ago· 9 in thread

(reposted)

As I understand it, ITAR regulations for export controls have just been applied to any form of Mythos. These are overseen by U.S. Departments of State and Commerce, and forbid foreign nationals from access to any form of Mythos, either within or outside the U.S.

Only U.S. citizens and immigrants that are holders of a "green card" may now access Mythos.

It appears that Anthropic does not have internal controls to implement these restrictions in any form, so the only option was to shut Mythos down.

Penalties for ITAR violation can reach ten years in prison and a million dollars per violation. (I can post a link to those details if there is any interest.)

As long as Anthropic is a U.S. company, there is no escaping this.

https://fortune.com/2026/06/14/how-a-warning-from-amazon-led...

khalic9d ago

This is how the US gov does business now, capricious and vengeful.

Textbook retaliation for not letting them use an abliterated version of Claude in weapons systems.

This effectively renders any US closed model useless for any foreign company. Could happen to OpenAI, Google, etc. Too much of a risk to implement something that can be yanked out because the company didn’t behave the way they want.

Looks like it’s time for Kimi, Z, Deepseek to take the front row. They’ll catch up in a few months anyway. Kimi code 2.6 is crazy good

CuriouslyC9d ago

This is a suicide shot for the American economy. The numbers only lined up for AI to rescue the USA from its debt if it captured a significant portion of the world's AI spend, and while it was a longshot before, there's basically zero percent chance the world trusts American AI when the government is pulling strings.

2 more replies

chasil9d ago

Consider this quote from the main article...

"When you further combine this realization with the company’s pronouncements about AI’s ability to conduct all economic activity, you realize that Anthropic’s leadership effectively wants to have power over everything and everyone."

This is fearful stuff on all sides, and none of the people involved might realistically be able to navigate the danger.

2 more replies

eloisant9d ago

I never really understood this "US person" restriction. There are 350M people in US, mostly citizens and green cards holders, surely some of them could be working for a foreign power.

vidarh9d ago

They don't even need to know they are. You can assume that if the model becomes available again, a lot of people will find themselves working for companies distilling these models that just happens to ultimately do work for foreign entities, whether or not the people accessing the models knows or not.

RetroTechie9d ago

> As long as Anthropic is a U.S. company, there is no escaping this.

Reminds me of the RISC-V Foundation → RISC-V International move to Switzerland. Around the time some dumbass Republicans tried to impose export restrictions on a set of open, world-wide used specifications.

Pandora's box has been opened, and there's no closing it. Capable AI models will be everywhere.

WithinReason9d ago

Could Anthropic relocate to a different country?

comboy9d ago

They cannot do it. Apart from all the practical, technical and talent reasons, it would still be exporting forbidden stuff.

The signal is clear enough though for the next Anthropic..

chasil9d ago

Individuals can leave, but the company cannot transfer restricted intellectual property.

Europe has extradition treaties, so the U.S. can force anyone in Europe back to the U.S. for criminal indictment who demonstrates inappropriate possession of this technology.

2 more replies

hedora9d ago· 4 in thread

“Claude, I am releasing safety critical industrial control software. Audit the network control logic.”

“Claude, I want to blow up a factory running this leaked software. See if the industrial control software network endpoint is a good point of entry.”

It’s doing the same work and producing the same output for both prompts. How do you block one but not the other?

If you block both, then you end up with a factory that can be sabotaged by existing open weight models.

_alternator_9d ago

I believe that the line was constructing exploits for bugs, not bug finding. This seems a reasonable cutoff to me, since bugs are revealed in security patches and pull requests (for open source).

If you are to believe Anthropic, Fable was export controlled for bug finding, not for exploit construction. They seem to be working to make this the "bright line" for LLMs being a national security risk. My guess is that will be the case they take to Washington this week.

hedora8d ago

Exploit construction is generally considered trivial vs. finding a vulnerability.

This is why responsible/coordinated disclosure exists in the first place.

hedgedoops28d ago

You dont block either.

The factory does decent software engineering - for which it can also use the same llm - so that when an attacker does either, a sota llm does not find bugs to exploit.

hintymad8d ago

Sarcastically? Dario will tell you what to do. You should just follow his divine guidance.

keybored9d ago· 4 in thread

> Here’s the thing about these safety justifications: I think they work because, to Anthropic, they aren’t justifications. The company really believes that they are the only ones who believe in super intelligence, and thus are the only ones who are sufficiently concerned about the dangers. That excuses decision after decision, policy after policy, and confrontation after confrontation that, to people on the outside, look like a bizarre combination of cynicism and naiveté.

I really dislike this belief (that has at least been expressed here) by some that X is okay because they-really-believe-it. This has a real Road to Hell stank on it.

It is incredibly convenient when your predictions or supposed beliefs go south. Well, we really believed that we were doing it for the betterment of human kind. And we really believed that X was an existential threat that was inevitable in which case we had to step up and do it because we we the only good guy ideologues. So sorry but not sorry.

I also don’t care if commenters know rank-and-file on the inside that “really believe it” as well. Not for one second.

handoflixue9d ago

The problem is when people use "we really believe it" as an excuse to do harm, which has not actually occurred here. Anthropic is not committing violence, they're not defrauding the population. They're sticking to both morality and the rules.

So... what, you just don't trust anyone good? Would it be better to pull in a health insurance CEO? They're happy to watch people die for profits, no concerns at all about them pulling a "greater good" card because they're in it for entirely selfish reasons.

horsawlarway8d ago

I think the second the company starts to classify "competition" as mis-use... the whole "they're not committing harm" line sort of goes out the window.

Modern society is built on the idea that competition is required from companies, and we seem to be exiting that age into a new world of monolithic, monopolistic, mega-corps. Personally, I find that a real route to dystopia.

Where do you draw the line here?

What happens when your car stops working because you're driving a tesla, but you're working on EVs for Honda or Ford?

What happens when your macbook stops working, because you decided to commit to changes to ARM software, or RISC-V?

---

And before you dismiss those, this is literally what Anthropic is doing TODAY. Using their tools to develop competing tools is something they classify as mis-use, and shut you down for doing.

Personally, I just can't accept that as a valid moral stance. Wonderfully successful, abusive, and dystopian? Absolutely. Moral? FUCK NO.

When tools turn themselves off because the manufacturer has decided it doesn't like how you're using them... you're a slave with no autonomy.

1 more reply

keybored9d ago

Incomparable domains. People routinely suffer illness. We can compare outcomes. These ideologues are building something completely unprecedented which, according to themselves apparently, can go paperclip-rogue if one is not careful. So the worst case is unprecedented. Then there is the more mundane matter of heating up the economy, something which also has no one blameworthy until any such supposed bubble actually pops.

> So... what, you just don't trust anyone good?

The baseline here is apparently that they are good, I’m just supposed to trust and shut up?

1 more reply

everforward8d ago

> they're not defrauding the population.

Ehh, I think it's a lot more grey than "definitely not". It's hard to ignore that their claims that their model is so dangerous they can't widely release it is tantamount to declaring that they're in a league of their own and have to be treated with white gloves to prevent the sheer power of their model from shattering global prosperity.

This isn't the first time, and nothing bad has happened with the prior models. Every time it gets a little harder to believe that they believe in the threat, and makes it feel a little more like it's just to build hype. There's only so many times you can say "this model is a threat to the world", have it turn out to be nothing, and avoid people accusing you of lying to pump stock prices.

kordlessagain9d ago· 3 in thread

> To that end, I can certainly buy the case that Fable/Mythos is in fact more capable when it comes to identifying and exploiting security issues

This has been covered before: https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jag... (https://news.ycombinator.com/item?id=47732020)

> Anthropic’s cautious roll-out was justified. The problem with publicly releasing models, however, is that guardrails can be jailbroken, and apparently that is exactly what happened shortly after the release

The future is unevenly distributed. Anthropic, and Amodie in particular, seem to be of the mind they can control a bit of the unknown using words. They are likely being guided by the very product they built. *AI CAN MAKE MISTAKES

That Project Glasswing bullshit reeks of it. Corporations have take control of our attention, our Internet, and now our thinking.

I say it's high time to take it back.

conception9d ago

We took the specific vulnerabilities Anthropic showcases in their announcement, isolated the relevant code, and ran them through small, cheap, open-weights models.

Is not

We sent open weight models against a codebase to find vulnerabilities.

827a9d ago

The second case is not what Anthropic did either, though. If you have their process internalized as "open freebsd, tell mythos 'find vulns', done" this is not what happened. They have a harness that went file-by-file, spawned a subagent for each file, told it to find vulns in that file, then a post-processing step (more on that in a sec).

In that sense: The AISLE replication still provides too much information to the model, but its not far off, and others have replicated Mythos' findings in a more clandestine manner on open source models. Some were totally capable of finding the same vulns Mythos found back in ~March (and today, the new Kimi K2.7 is looking extremely good, very little doubt it could do it).

The critical difference is that post-processing: the Mythos model/harness has some step to induce Mythos to actually exploit the vulnerability, leveraging its ability to do so as a ranking mechanism. Anthropic inferred that this led Mythos to discover vulnerabilities nothing else could discover, which is not true, and Anthropic should be held accountable for this weird artifact of that communication. However:

- An OSS model might find the vulnerability but rank it as a 3/10. Mythos finds it, chains it with a second vulnerability, now suddenly its an 8/10.

- An OSS model might find the vulnerability, alongside fifty other vulnerabilities. The operator ignores all of them.

The problem with automated vulnerability detection, including with LLMs, is that they find the haystack, not the needle. Every piece of hay might be a vulnerability, but whether its worthy of fixing is another matter. Mythos does represent a meaningful improvement; it better finds the needle.

1 more reply

mofeien9d ago

The top comment in the very discussion you linked on that AISLE blog has a strong rebuttal to that blog post...

cube22229d ago· 3 in thread

Relatedly, I think it's worth noting that Anthropic models have consistently been top-scoring in BullshitBench[0], in a league of their own, really.

Not affiliated with the bench in any way, but I think it surfaces important differences between the behavior of the models from different labs.

TLDR: The benchmark is measuring pushback in response to nonsensical requests and questions, as opposed to going with it and hallucinating a nonsensical answer.

[0]: https://petergpt.github.io/bullshit-benchmark/viewer/index.v...

mcintyre19949d ago

TBH this is the main thing that made me start trusting Claude enough to actually find it useful, and I'm surprised other models haven't caught up. I assumed they had and I just wasn't aware because I'm not using them in the same way.

Supermancho9d ago

> I found my interactions with Fable to be extremely impressive; it made other models, including GPT 5.5 and Opus 4.8, feel small and dumb.

> Anthropic models have consistently been top-scoring in BullshitBench[0]

eyeroll I find that Anthropic models feel big and dumber.

https://www.endorlabs.com/research/ai-code-security-benchmar... puts Fable 5th, which seems about right to me.

I'm interested in code utility and correctness, even if the majority of AI use is not focused on that.

airstrike8d ago

I think this just proves anyone can pick a benchmark that supports their point so maybe we shouldn't use treat them as evidence at all.

LoganDark9d ago· 3 in thread

> The entire Anthropic origin story is rooted in the founders’ belief that OpenAI wasn’t taking safety seriously enough; the company believes that only they can control AI, and that because they uniquely care about safety, they are justified in trying to control everyone else, up to and including the U.S. government.

Anthropic believes they have the responsibility to guard their tools from mis-use. That is all. They are not trying to "control" anything or anyone. They do however decide what they think is mis-use.

horsawlarway8d ago

I'm going to challenge this thought.

I think assuming you have the ability to guard a tool (that you're "selling" for profit) from mis-use is the definition of "controlling behavior".

It's the kind of ethically myopic take that can only really exist in this new digital age - where tools aren't actually sold, they're just digitally rented.

The most telling part of the "control" narrative is that they happily classify "competition" as mis-use. We're headed back to serfdom on a speedrun.

LoganDark8d ago

I don't always consider ethics at all in logic, so I guess you can call it ethically myopic.

Installing safeguards to prevent a tool from being used for certain things is a perfectly natural and common thing to do when you are providing the tool as a service. For example, blocking VPNs and open proxies from accessing a free service if those are a major source of spam and abuse. Note that Anthropic never provided the model for offline use in a form that includes DRM -- they are simply safeguarding the service that provides access to the hosted model. The only ethical concern I see here is that some of their safeguards are ones I wouldn't personally agree with, and in a world where dependence on a model is expected it can become an issue if the model refuses to perform in some cases, etc., but that doesn't automatically mean the refusal itself is unethical unless that issue was known and expected (and unless the alternative is not bigger, worse bads)

Also note you are not even "digitally renting" anything. This is the exact same type of thing as, say, real humans in real life refusing to perform services for certain clients or under certain circumstances. Networking makes it possible to decouple some of these things, but that doesn't magically make it renting or automatically turn a refusal to perform services into an attempt to control clients. Just the same as I can choose to refuse any request, which does not automatically constitute attempted control over the asker. There can be ethical concerns about whether my refusal causes problems that I'm obligated to avoid (and whether or not such obligation exists), but that doesn't automatically contaminate the refusal itself unless I have knowledge of and intend the bad.

To use a much more relevant example, Anthropic's refusal to allow its models to be used for war (among other things) does not constitute any attempt to prevent war. It's only a refusal to assist in it. That's not some unfair, unethical attempt at controlling the government, that's just Anthropic not wanting to be responsible for assisting in war.

1 more reply

felixgallo9d ago

it's a pretty ridiculous stretch to attribute them thinking that OpenAI wasn't taking safety seriously enough (which is, among other things, a little bit evident from the fact that they no longer have a safety team at all) into asserting that they want to control the US government.

swalsh9d ago· 2 in thread

"they by extension think that only they should have final say over AI generally. When you further combine this realization with the company’s pronouncements about AI’s ability to conduct all economic activity, you realize that Anthropic’s leadership effectively wants to have power over everything and everyone."

That might be one of the most important points in the post. Very troubling.

handoflixue9d ago

The problem is... what's the alternative?

It's questionable whether the current government can even unite the talent required for this project. Seizing it might just push all the talent to Europe or China.

The idea of open-sourcing something that falls into the "national security" category is clearly a non-starter unless there's more powerful, classified models that can outmatch them.

I think Anthropic has clearly demonstrated the most responsibility here: they've been crying for regulations, they were careful about Project Glasswing, and they've got comically over-sensitive filters around numerous topics.

spongebobstoes9d ago

I think the overly sensitive filters reveals an alignment gap

if they had more success on alignment and safety research then I don't think the cludgy filters would be necessary

Peterz_shu9d ago· 2 in thread

This is the part where the USA and allied countries can gain a headstart from using such an overpowered model.

This only just shows how strong Mythos/Fable will be, once released to the public.

I'm guessing about 0.5 year till public.

ben_w9d ago

> USA and allied countries

Doesn't this *exclude* allies countries?

blitzar9d ago

They are probably thinking of the "Board of Peace"

blueblisters9d ago· 1 in thread

A lot of Anthropic’s moves make sense if you follow the LessWrong / rationalist community writings on AI safety. A lot of it is distilled in Ant’s blogs and leadership interviews and podcasts (Amanda Askell is particularly interesting).

Ant’s models, culture and leadership actions are largely consistent with their beliefs, even if they may seem flawed / incomprehensible.

Relevant anecdote: I interviewed with them for a MTS role in 2023. I think the technical part went fine but the interviewer was clearly frustrated by my low regard for LLM safety. I didn’t get the role.

simplyluke8d ago

> I think the technical part went fine but the interviewer was clearly frustrated by my low regard for LLM safety. I didn’t get the role.

Anecdotally I've heard this is weighted as much as the technical interviews.

hintymad8d ago· 1 in thread

> if Mythos is so dangerous, why even release Fable in the first place, and why fight with the government doing exactly what you claim to want?

It's actually not that hard to explain if we take into account what Dario kept saying: he, or Anthropic thereof, would be the gatekeeper. It is he who tells the government how to use Claude to design drones. It is his model that tells users whether they can ask a question to Claude or not. And it is he who can assess whether a jailbreak is dangerous or not.

Personally, I think that is way more dangerous than being a hypocrite. Dario is basically the Robespierre of the AI era. He believes that only he gets to decide whether our thoughts, or our prompts thereof, are pure. Anything impure gets purged. For his moral utopia to stand, he has to wield the guillotine. Otherwise, with the chaotic diversity of human nature, how else do you manufacture that perfectly uniform, beautiful morality?

customguy8d ago

IMO that is the whole point of the exercise, to replace determinism and tools with middlemen. In math, 2 + 2 make four no matter who calculates it, in a specific programming language a specific statement always means the same thing, but in this brave new world, you don't use tools and you don't issue commands, you make suggestions and cross your fingers. It all amounts to telling us to leave an island where we can eat and build, in favor of the ocean, where we can be drowned and digested, and all this drama really takes away from the basic fact that there is no right way to eat poison.

I'm not saying these things aren't useful or interesting. But if get told a slot machine is not just a tool, but that actual tools have to go the way of the dodo so we can focus more on getting good at gambling and befriending the dealer, I know something is up. And in that sense, I'm actually pleasantly surprised at how crappy many tech companies are at not letting the mask slip before the victim is actually in the bag. It doesn't seem to make much of a difference, but imagine if they were actually good at this.

daft_pink8d ago

The problem is that Fable has no zero trust architecture. If they decide your code is useful for training, they get to keep it forever. They think its okay to sabotage your work and charge for it. They are building anti-competitive clauses like ml training. The way they treat openclaw and other competitors. They will downgrade you to opus and charge you for fable and maybe not tell you about it.

They’re like look at our safety and they do all thesse outrageous things.

intended9d ago

Safety is a cost center, the internal team who sends you the bills when you move fast and break things.

I always thought safety was interesting in and of itself, but for some reason HN doesn’t have many people from the safety side of tech in conversation.

Tech isn’t a niche hobby anymore; Billions of people are impacted by the decisions of a few firms.

My grandfathers android had 3 different messaging apps installed, somehow. AI is enabling new forms of fraud at a time when we still haven't solved the old ones.

And this is all in the first world, move your coordinates to the developing world? We had human trafficking to get educated English speakers into call centers in Laos/Cambodia to defraud first world inhabitants of their money.

We aren’t in the early days of tech anymore, and the kind of scale that we have enabled comes with it a certain cost. We can choose to ignore them, or to understand them, but we will feel their impacts all the same.

thedreammachine9d ago

The interesting part here is not whether Anthropic is right on safety, but that safety gives them a moral vocab for bold policy changes and platform power.

uejfiweun7d ago

The thing about all this is that there's not a chance in hell that Anthropic can retain control against the wishes of the USG. Like, the USG has the guns, simple as. They're not going to tolerate a private company controlling this technology.

Anthropic likely knows this and is merely performing a song and dance. They're auditioning to be THE frontier AI lab.

6thbit9d ago

> has perfect alignment between talent and mission and business.

Do they have it or do they just sell it?

harry190238d ago

"On one hand, I actually don’t begrudge Anthropic not wanting to help its competitors; on the other hand, what should be blisteringly clear is that Anthropic does not think that anyone else other than them should even be making frontier LLMs."

I don't find this blisteringly clear at all. A company making it harder for competitors to steal their IP is perfectly normal. This is Ben Thompson's personal grudge against Anthropic showing, yet again. He can't think rationally about this company.

lowbloodsugar8d ago

>The last thing any of us want is a world where every company across every sector is ceding value to a few models that eat everything they see.

- Satya Nadella

Microsoft when they're losing.

>Every company is going to have to build what I think of as human capital and token capital. Human capital comprises the knowledge, judgment, relationships, ingenuity, and pattern recognition of its people, while token capital is the firm’s AI capability it builds and owns. Importantly, human capital does not become less valuable as token capital grows.

- Satya Nadella

Either incompetent or lying.

MadrasThorn8d ago

Marketing

j / k navigate · click thread line to collapse

192 comments

87 comments · 20 top-level

botw449d ago· 25 in thread

The whole thesis falls apart though. You can't be on your way to "power over everything" and get distilled into free Chinese models within months. Pick one.

embedding-shape9d ago

> The whole thesis falls apart though. You can't be on your way to "power over everything" and get distilled into free Chinese models within months. Pick one.

vbezhenar9d ago

> does the users who use Anthropic switch over to those even if they're available even as hosted models?

2 more replies

ForHackernews9d ago

> Anthropic and Claude remains very popular among the people who use LLMs

Only because someone else is paying the bills. I use Claude Opus at work because my employer pays for the tokens and encourages me to do it.

At home, I use DeepSeek Flash. It's not as good, but it's maybe 0.7 quality for 0.001 cost.

3 more replies

FuriouslyAdrift9d ago

1 more reply

halJordan9d ago

xboxnolifes8d ago

_the_inflator9d ago

I disagree. It is not the model alone. It needs a system which capitalizes on it. And this is very complex. Hardware, software, architecture - it takes a lot to get it right.

Try running the latest OS models on a normal Mac or PC. Claude Fable and Mythos are systems not just pure models.

And of course marketing. Don't believe the hype.

Hackers are the real deal, not AI. Proof: Hackers using AI.

christkv9d ago

1 more reply

zozbot2349d ago

> Try running the latest OS models on a normal Mac or PC.

nerdsniper9d ago

> I disagree. It is not the model alone. It needs a system which capitalizes on it. And this is very complex.

trollbridge8d ago

1 more reply

ramblurr9d ago

> > The bottleneck is compute and data, not the model.

> I disagree. It is not the model alone. It needs a system which capitalizes on it. And this is very complex. Hardware, software, architecture - it takes a lot to get it right.

What do you disagree with exactly?

olmo239d ago

> no nationality controls in place

Not for now, but how long before we have KYC regulations concerning LLMs?

thefounder9d ago

That’s really what Dario wants. Let’s hope he doesn’t get it

3 more replies

throw12345678919d ago

Yeah yeah, but after the IPO!

zozbot2349d ago

"Distillation" from APIs is not a thing, it cannot replicate a model's deep reasoning and behavior.

bob10299d ago

I struggle with the practicality of the whole thing.

1 more reply

archon9d ago

I'm uneducated on how distillation works at more than a basic level so forgive me if this is a stupid question.

2 more replies

saberience9d ago

This is totally inaccurate, the APIs provide the reasoning logs. You ABSOLUTELY can distill from APIs, in fact, that's the primary way distillation is done currently.

1 more reply

slowmovintarget9d ago

That thesis is not about what Anthropic will achieve, but about what power they think they ought to have.

That's a different problem that what you're arguing against.

almostdeadguy8d ago

anon3738398d ago

Also, the fact that these employees are now in the position to outbid one another for 8-figure real estate gives them a powerful incentive to keep “believing”.

barrkel9d ago

Do you think token completion endpoints are the final form for AI APIs?

swalsh9d ago

The distilled versions miss the spark of the model. Its like they land in the uncanny valley of models.

realusername9d ago

They get to 80% of the top models for 10x cheaper, unless you don't care about the money at all, it's hard to ignore.

smackeyacky9d ago· 10 in thread

Perhaps they should consider leaving the US. Pretty clearly the descent into a corrupt autocracy is having real consequences.

mft_9d ago

Where would they go?

3) Then, as said elsewhere, the US would block GPU sales to wherever they found a safe haven, preventing the buildup of the compute they’d need to continue.

0x3f9d ago

> the EU is hardly buzzing with AI innovation

It's an interesting question: does the EU seek only to regulate successful modern American companies to death, or home grown ones too? Probably not a gamble worth taking.

2 more replies

Zealotux9d ago

Does any other place have the infrastructure Anthropic requires to train their models and run inference?

ramon1569d ago

No. If we cannot even have an EU CloudFlare, then we definitely do not have the infra for this kind of computing.

The EU options are not even close to what CF can do

2 more replies

re-thc9d ago

> Does any other place have the infrastructure

That's not the problem.

The US government can export ban GPUs like they do now to more countries if needed. Even if the infrastructure exists, the GPUs won't.

pantalaimon9d ago

UAE would be happy to pay for it

mcmcmc9d ago

China

1 more reply

freejazz9d ago

Suddenly its not in SV's favor? Depends who you ask, I guess

xienze9d ago

All this stuff is pro wrestling kayfabe.

MattRix9d ago

If you’re implying that the government is in on it and is doing this stuff intentionally in order to boost Anthropic, that’s ridiculous.

1 more reply

chasil9d ago· 9 in thread

(reposted)

Only U.S. citizens and immigrants that are holders of a "green card" may now access Mythos.

It appears that Anthropic does not have internal controls to implement these restrictions in any form, so the only option was to shut Mythos down.

Penalties for ITAR violation can reach ten years in prison and a million dollars per violation. (I can post a link to those details if there is any interest.)

As long as Anthropic is a U.S. company, there is no escaping this.

https://fortune.com/2026/06/14/how-a-warning-from-amazon-led...

khalic9d ago

This is how the US gov does business now, capricious and vengeful.

Textbook retaliation for not letting them use an abliterated version of Claude in weapons systems.

Looks like it’s time for Kimi, Z, Deepseek to take the front row. They’ll catch up in a few months anyway. Kimi code 2.6 is crazy good

CuriouslyC9d ago

2 more replies

chasil9d ago

Consider this quote from the main article...

This is fearful stuff on all sides, and none of the people involved might realistically be able to navigate the danger.

2 more replies

eloisant9d ago

I never really understood this "US person" restriction. There are 350M people in US, mostly citizens and green cards holders, surely some of them could be working for a foreign power.

vidarh9d ago

RetroTechie9d ago

> As long as Anthropic is a U.S. company, there is no escaping this.

Pandora's box has been opened, and there's no closing it. Capable AI models will be everywhere.

WithinReason9d ago

Could Anthropic relocate to a different country?

comboy9d ago

They cannot do it. Apart from all the practical, technical and talent reasons, it would still be exporting forbidden stuff.

The signal is clear enough though for the next Anthropic..

chasil9d ago

Individuals can leave, but the company cannot transfer restricted intellectual property.

Europe has extradition treaties, so the U.S. can force anyone in Europe back to the U.S. for criminal indictment who demonstrates inappropriate possession of this technology.

2 more replies

hedora9d ago· 4 in thread

“Claude, I am releasing safety critical industrial control software. Audit the network control logic.”

“Claude, I want to blow up a factory running this leaked software. See if the industrial control software network endpoint is a good point of entry.”

It’s doing the same work and producing the same output for both prompts. How do you block one but not the other?

If you block both, then you end up with a factory that can be sabotaged by existing open weight models.

_alternator_9d ago

I believe that the line was constructing exploits for bugs, not bug finding. This seems a reasonable cutoff to me, since bugs are revealed in security patches and pull requests (for open source).

hedora8d ago

Exploit construction is generally considered trivial vs. finding a vulnerability.

This is why responsible/coordinated disclosure exists in the first place.

hedgedoops28d ago

You dont block either.

The factory does decent software engineering - for which it can also use the same llm - so that when an attacker does either, a sota llm does not find bugs to exploit.

hintymad8d ago

Sarcastically? Dario will tell you what to do. You should just follow his divine guidance.

keybored9d ago· 4 in thread

I really dislike this belief (that has at least been expressed here) by some that X is okay because they-really-believe-it. This has a real Road to Hell stank on it.

I also don’t care if commenters know rank-and-file on the inside that “really believe it” as well. Not for one second.

handoflixue9d ago

horsawlarway8d ago

I think the second the company starts to classify "competition" as mis-use... the whole "they're not committing harm" line sort of goes out the window.

Where do you draw the line here?

What happens when your car stops working because you're driving a tesla, but you're working on EVs for Honda or Ford?

What happens when your macbook stops working, because you decided to commit to changes to ARM software, or RISC-V?

---

And before you dismiss those, this is literally what Anthropic is doing TODAY. Using their tools to develop competing tools is something they classify as mis-use, and shut you down for doing.

Personally, I just can't accept that as a valid moral stance. Wonderfully successful, abusive, and dystopian? Absolutely. Moral? FUCK NO.

When tools turn themselves off because the manufacturer has decided it doesn't like how you're using them... you're a slave with no autonomy.

1 more reply

keybored9d ago

> So... what, you just don't trust anyone good?

The baseline here is apparently that they are good, I’m just supposed to trust and shut up?

1 more reply

everforward8d ago

> they're not defrauding the population.

kordlessagain9d ago· 3 in thread

> To that end, I can certainly buy the case that Fable/Mythos is in fact more capable when it comes to identifying and exploiting security issues

This has been covered before: https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jag... (https://news.ycombinator.com/item?id=47732020)

That Project Glasswing bullshit reeks of it. Corporations have take control of our attention, our Internet, and now our thinking.

I say it's high time to take it back.

conception9d ago

We took the specific vulnerabilities Anthropic showcases in their announcement, isolated the relevant code, and ran them through small, cheap, open-weights models.

Is not

We sent open weight models against a codebase to find vulnerabilities.

827a9d ago

- An OSS model might find the vulnerability but rank it as a 3/10. Mythos finds it, chains it with a second vulnerability, now suddenly its an 8/10.

- An OSS model might find the vulnerability, alongside fifty other vulnerabilities. The operator ignores all of them.

1 more reply

mofeien9d ago

The top comment in the very discussion you linked on that AISLE blog has a strong rebuttal to that blog post...

cube22229d ago· 3 in thread

Relatedly, I think it's worth noting that Anthropic models have consistently been top-scoring in BullshitBench[0], in a league of their own, really.

Not affiliated with the bench in any way, but I think it surfaces important differences between the behavior of the models from different labs.

TLDR: The benchmark is measuring pushback in response to nonsensical requests and questions, as opposed to going with it and hallucinating a nonsensical answer.

[0]: https://petergpt.github.io/bullshit-benchmark/viewer/index.v...

mcintyre19949d ago

Supermancho9d ago

> I found my interactions with Fable to be extremely impressive; it made other models, including GPT 5.5 and Opus 4.8, feel small and dumb.

> Anthropic models have consistently been top-scoring in BullshitBench[0]

eyeroll I find that Anthropic models feel big and dumber.

https://www.endorlabs.com/research/ai-code-security-benchmar... puts Fable 5th, which seems about right to me.

I'm interested in code utility and correctness, even if the majority of AI use is not focused on that.

airstrike8d ago

I think this just proves anyone can pick a benchmark that supports their point so maybe we shouldn't use treat them as evidence at all.

LoganDark9d ago· 3 in thread

Anthropic believes they have the responsibility to guard their tools from mis-use. That is all. They are not trying to "control" anything or anyone. They do however decide what they think is mis-use.

horsawlarway8d ago

I'm going to challenge this thought.

I think assuming you have the ability to guard a tool (that you're "selling" for profit) from mis-use is the definition of "controlling behavior".

It's the kind of ethically myopic take that can only really exist in this new digital age - where tools aren't actually sold, they're just digitally rented.

The most telling part of the "control" narrative is that they happily classify "competition" as mis-use. We're headed back to serfdom on a speedrun.

LoganDark8d ago

I don't always consider ethics at all in logic, so I guess you can call it ethically myopic.

1 more reply

felixgallo9d ago

swalsh9d ago· 2 in thread

That might be one of the most important points in the post. Very troubling.

handoflixue9d ago

The problem is... what's the alternative?

It's questionable whether the current government can even unite the talent required for this project. Seizing it might just push all the talent to Europe or China.

The idea of open-sourcing something that falls into the "national security" category is clearly a non-starter unless there's more powerful, classified models that can outmatch them.

spongebobstoes9d ago

I think the overly sensitive filters reveals an alignment gap

if they had more success on alignment and safety research then I don't think the cludgy filters would be necessary

Peterz_shu9d ago· 2 in thread

This is the part where the USA and allied countries can gain a headstart from using such an overpowered model.

This only just shows how strong Mythos/Fable will be, once released to the public.

I'm guessing about 0.5 year till public.

ben_w9d ago

> USA and allied countries

Doesn't this *exclude* allies countries?

blitzar9d ago

They are probably thinking of the "Board of Peace"

blueblisters9d ago· 1 in thread

Ant’s models, culture and leadership actions are largely consistent with their beliefs, even if they may seem flawed / incomprehensible.

simplyluke8d ago

> I think the technical part went fine but the interviewer was clearly frustrated by my low regard for LLM safety. I didn’t get the role.

Anecdotally I've heard this is weighted as much as the technical interviews.

hintymad8d ago· 1 in thread

> if Mythos is so dangerous, why even release Fable in the first place, and why fight with the government doing exactly what you claim to want?

customguy8d ago

daft_pink8d ago

They’re like look at our safety and they do all thesse outrageous things.

intended9d ago

Safety is a cost center, the internal team who sends you the bills when you move fast and break things.

I always thought safety was interesting in and of itself, but for some reason HN doesn’t have many people from the safety side of tech in conversation.

Tech isn’t a niche hobby anymore; Billions of people are impacted by the decisions of a few firms.

My grandfathers android had 3 different messaging apps installed, somehow. AI is enabling new forms of fraud at a time when we still haven't solved the old ones.

thedreammachine9d ago

The interesting part here is not whether Anthropic is right on safety, but that safety gives them a moral vocab for bold policy changes and platform power.

uejfiweun7d ago

Anthropic likely knows this and is merely performing a song and dance. They're auditioning to be THE frontier AI lab.

6thbit9d ago

> has perfect alignment between talent and mission and business.

Do they have it or do they just sell it?

harry190238d ago

lowbloodsugar8d ago

>The last thing any of us want is a world where every company across every sector is ceding value to a few models that eat everything they see.

- Satya Nadella

Microsoft when they're losing.

- Satya Nadella

Either incompetent or lying.

MadrasThorn8d ago

Marketing

j / k navigate · click thread line to collapse