Project Glasswing: Securing critical software for the AI era

(anthropic.com)

1541 points · Ryan5453 · 2mo ago · 836 comments

Related: Assessing Claude Mythos Preview's cybersecurity capabilities - https://news.ycombinator.com/item?id=47679155

System Card: Claude Mythos Preview [pdf] - https://news.ycombinator.com/item?id=47679258

Also: Anthropic's Project Glasswing sounds necessary to me - https://news.ycombinator.com/item?id=47681241

255 comments · 113 top-level

pizlonator2mo ago· 15 in thread

It's messed up that Anthropic simultaneously claims to be a public benefit corporation and is also picking who gets to benefit from their newly enhanced cybersecurity capabilities. It means that the economic benefit is going to the existing industry heavyweights.

(And no, the Linux Foundation being in the list doesn't imply broad benefit to OSS. Linux Foundation has an agenda and will pick who benefits according to what is good for them.)

I think it would be net better for the public if they just made Mythos available to everyone.

hector_vasquez2mo ago

Releasing the model to bad actors at the same time as the major OS, browser, and security companies would be one idea. But some might consider that "messed up" too, whatever you mean by that. But in terms of acting in the public benefit, it seems consistent to work with companies that can make significant impact on users' security. The stated goal of Project Glasswing is to "secure the world's most critical software," not to be affirmative action for every wannabe out there.

tokioyoyo2mo ago

Damned if you do, damned if you don’t. “Extremely capable model that can find exploits” has always been a fear, and the first company to release it in public will cause a bloodbath. But it will also be the first company to prove itself.

SheinhardtWigCo2mo ago

> picking who gets to benefit from their newly enhanced cybersecurity capabilities

You could say this about coordinated disclosure of any widespread 0-day or new bug class, though

cedws2mo ago

Not only companies, they're going to be taking applications from individual researchers. No doubt access will be granted only to established researchers, effectively locking out graduates and those early in their careers. This is bad.

lelanthran2mo ago

Or (and hear me out), they are close to an IPO and want to ensure that there is a world-ending threat around which they can cluster the biggest names, with themselves leading that group.

I think I just broke my cynicism meter :-(

baq2mo ago

> It's messed up that Anthropic simultaneously claims to be a public benefit corporation and is also picking who gets to benefit from their newly enhanced cybersecurity capabilities. It means that the economic benefit is going to the existing industry heavyweights.

It's messed up that the US Government simultaneously claims to be a public benefit and is also picking who gets to benefit from their newly enhanced nuclear capabilities.

-- someone in 1945, probably

SubiculumCode2mo ago

That can be true and still be the best of bad options (short of destroying the model altogether). These models may prove quite dangerous. That they did this instead of selling their services to every company at a huge premium says a lot about Anthropic's culture.

jstummbillig2mo ago

What? The economic benefit of system critical software not totally breaking in a few weeks goes to roughly everyone. Insofar as Apple/Google/MS/Linux Foundation economically benefit from being able to patch pressing critical software issues upfront (I am not even exactly sure what that is supposed to mean, it's not like anyone is going to use more or less Windows or Android if this happened any other way), that's a good thing for everyone and the economic benefits of that manifest for everyone.

titzer2mo ago

In the long term, you're right, but in the short term, it's going to be a bloodbath.

hmokiguess2mo ago

While I agree with you, in some ways I'd argue that this is just them being transparent on what probably would inevitably already happen at the scale of these corporate overlords and modern monarchs.

There will always be a more capable technology in the hands of the few who hold the power, they're just sharing that with the world more openly.

oytis2mo ago

That's just in line with their ethics. They also maintain that countries other than the US should not have SOTA AI capabilities.

Flere-Imsaho2mo ago

If you're a maintainer, you can apply here:

https://claude.com/contact-sales/claude-for-oss

... As mentioned in the article.

malcolmgreaves2mo ago

Not really. It’s a lot better than the anarchy of releasing it and having a bunch of bad people with money use it to break software that everyone’s lives depend on. Many technologies should be gatekept because they’re dangerous. Sometimes that’s permanent, like a nuclear weapon. Sometimes that’s temporary, like a new LLM that’s good at finding exploits. It can be released to the wider public once its potential for damage has been mitigated.

stale20022mo ago

Better security is a good thing, not a bad thing, regardless of which companies are more difficult to hack. Hemming and hawing over a clear and obvious good is silly.

dragonelite2mo ago

Cue the "First time?" meme.

LiamPowell2mo ago· 12 in thread

> Mythos Preview identified a number of Linux kernel vulnerabilities that allow an adversary to write out-of-bounds (e.g., through a buffer overflow, use-after-free, or double-free vulnerability.) Many of these were remotely-triggerable. However, even after several thousand scans over the repository, because of the Linux kernel’s defense in depth measures Mythos Preview was unable to successfully exploit any of these.

Do they really need to include this garbage which is seemingly just designed for people to take the first sentence out of context? If there's no way to trigger a vulnerability then how is it a vulnerability? Is the following code vulnerable according to Mythos?

    if (x != null) {
        y = *x; // Vulnerability! X could be null!
    }

Is it really so difficult for them to talk about what they've actually achieved without smearing a layer of nonsense over every single blog post?

Edit: See my reply below for why I think Claude is likely to have generated nonsensical bug reports here: https://news.ycombinator.com/item?id=47683336

QuiEgo2mo ago

I agree the wording is a bit alarmist, but a closer example to what they are saying is:

  bool silly_mistake = false;
  
  //... lots of lines of code

  free(x);

  //... lots of lines of code

  if (silly_mistake) { // silly_mistake shown to be false at this point in the program in all testing, so far
     free(x);
  }

A bug like above would still be something that would be patched, even if a way to exploit it has not yet been found, so I think it's fair to call out (perhaps with less sensationalism).

FWIW there's a whole boutique industry around finding these. People have built whole careers around farming bug bounties for bugs like this. I think they will be among the first set of software engineers really in trouble from AI.
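
As a rough illustration of why a latent double free like the one sketched above still gets patched even without a known exploit, here is one common defensive pattern in C: free through a helper that also clears the pointer, so a second call ends up as `free(NULL)`, which the C standard defines as a no-op. The names `free_and_clear` and `release_twice_if` are hypothetical and only exist for this sketch; nothing here is taken from the article or the kernel.

```c
#include <stdlib.h>

/* Hypothetical helper: frees *pp and clears it, so a later call on the
 * same pointer variable becomes free(NULL), which does nothing. */
void free_and_clear(void **pp)
{
    if (pp == NULL) {
        return;
    }
    free(*pp);
    *pp = NULL;
}

/* Mirrors the shape of the example above: a second free guarded by a
 * flag that has been false in all testing so far.
 * Returns 1 if the buffer pointer is NULL afterwards. */
int release_twice_if(int silly_mistake)
{
    void *x = malloc(16);

    free_and_clear(&x);
    if (silly_mistake) {
        free_and_clear(&x); /* harmless: x is already NULL */
    }
    return x == NULL;
}
```

This only protects the one pointer variable that gets cleared; other aliases to the same allocation would still dangle, which is part of why such bugs are fixed at the source rather than papered over.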

ralph842mo ago

Just because the plane can fly on one engine doesn't mean you don't fix the other engine when it fails.

sophiebits2mo ago

Presumably they mean they could make user code trigger a write out of bounds to kernel memory, but they couldn’t figure out how to escalate privileges in a “useful” way.

red75prime2mo ago

The kernel address space layout randomization they are talking about is a bit different from (x != null). Another bug may allow an attacker to locate the required address.

MatejKafka2mo ago

It could very well be an actual reachable buffer overflow, but with KASLR, canaries, CET and other security measures, it's hard to exploit it in a way that doesn't immediately crash the system.

bottlepalm2mo ago

We've very quickly reached the point where AI models are now too dangerous to publicly release, and HN users are still trying to trivialize the situation.

rootkea2mo ago

> The model autonomously found and chained together several vulnerabilities in the Linux kernel—the software that runs most of the world’s servers—to allow an attacker to escalate from ordinary user access to complete control of the machine.

danielheath2mo ago

Is this code multithreaded? X could indeed be null, in that case.
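
To make the multithreading point concrete: in C, a null check followed by a second read of a shared pointer can race with another thread that clears it in between. A minimal sketch using C11 atomics, with the made-up names `shared_x` and `read_shared`, loads the pointer once into a local and then checks and dereferences only that copy.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical shared pointer that another thread may set to NULL. */
_Atomic(int *) shared_x = NULL;

/* Loads the shared pointer exactly once. The NULL check and the
 * dereference both use the local copy, so a concurrent store of NULL
 * to shared_x cannot slip in between them.
 * Returns fallback when the pointer is NULL. */
int read_shared(int fallback)
{
    int *p = atomic_load(&shared_x);

    if (p != NULL) {
        return *p;
    }
    return fallback;
}
```

This addresses only the check-then-reload race. It does not keep the pointed-to object alive if another thread frees it, which would need a lock or some form of reference counting.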

slopinthebag2mo ago

It's incredible how when you have experienced and knowledgeable software engineers analyse these marketing claims, they turn out to be full of holes. Yet at the same time, apparently "AI" will be writing all the code in the next 3-6 months.

userbinator2mo ago

That example you gave is extremely memorable as I recognised it as exactly one of the insanely stupid false positives that a highly praised (and expensive) static analyser I ran on a codebase several years ago would emit copiously.

deadliftdouche2mo ago

I agree. There are more blogs talking about LLMs finding vulnerabilities than there are actual exploitable vulns found by LLMs. 99.9% of these vulnerabilities will never have a PoC because they are worthless unexploitable slop and a waste of everyone's time.

bri3d2mo ago

I think the point they were trying to make here was “Claude did better than a fuzzer because it found a bunch of OOB writes and was able to tell us they weren’t RCE,” not “Claude is awesome because it found a bunch of unreachable OOB writes.”

ofjcihen2mo ago· 9 in thread

I’m sure the new model is a step above the old one but I can’t be the only person who’s getting tired of hearing about how every new iteration is going to spell doom/be a paradigm shift/change the entire tech industry etc.

I would honestly go so far as to say the overhype is detrimental to actual measured adoption.

qnleigh2mo ago

There is plenty of overhyping, no one denies that. But the antidote is not to dismiss everything. Ignore the words and look at the data.

In this case, I see a pretty strong case that this will significantly change computer security. They provide plenty of evidence that the models can create exploits autonomously, meaning that the cost of finding valuable security breaches will plummet once they're widely available.

jstummbillig2mo ago

> how every new iteration is going to spell doom/be a paradigm shift/change the entire tech industry etc.

It's much the dynamic between parents and a child. The child, with limited hindsight, almost zero insight and no ability to forecast, is annoyed by their parents. Nothing bad ever happens! Why won't parents stop being so worried all the time and make a fuss over nothing?

The parents, which the child somewhat starts to realize but not fully, have no clue what they are doing. There is a lot they don't know and are going to be wrong about, because it's all new to them. But, what they do have is a visceral idea of how bad things could be and that's something they have to talk to their child about too.

In the eyes of the parents the child is % dead all the time. Assigning the wrong % makes you look like an idiot and not being able to handle any % too. In the eyes of the child actions leading to death are not even a concept. Hitting the right balance is probably hard, but not for the reasons the child thinks.

nbardy2mo ago

There are step changes that actually merit this, though. And a zero-day machine IS one of those. It went from a 4% zero-day success rate to 85% on Firefox.

Can you not see the significance of that?

alexey-salmin2mo ago

I think Claude Code with Sonnet 4.6 is already at the level of paradigm shift and can change the entire tech industry.

If you're paranoid it doesn't mean you're not being followed. If something is overhyped it doesn't mean it's not game-changing.

nl2mo ago

Well Opus 4.5/4.6 kinda was right?

I mean software development has changed more since then than it has in my 30 year software development career.

jwpapi2mo ago

I agree, I can’t open any social media anymore.

corranh2mo ago

It’s great marketing to lead with how the n+1 model is so amazing that you can’t have it yet.

jillesvangurp2mo ago

> I would honestly go so far as to say the overhype is detrimental to actual measured adoption.

I think you are a bit dishonest about how objectively you are measuring. From where I'm sitting, I don't know a lot of developers that still artisanally code like they did a few years ago. The question is no longer if they are using AI for coding but how much they are still coding manually. I myself barely use IDEs at this point. I won't be renewing my IntelliJ license. I haven't touched it in weeks. It doesn't do anything I need anymore.

As for security, I think enough serious people have confirmed that AI reported issues by the likes of Anthropic and OpenAI are real enough despite the massive amounts of AI slop that they also have to deal with in issue trackers. You can ignore that all you like. But I hope people that maintain this software take it a bit more seriously when people point out exploitable issues in their code bases.

The good news of course is that we can now find and fix a lot of these issues at scale and also get rid of whole categories of bugs by accelerating the project of replacing a lot of this software with inherently safer versions not written in C/C++. That was previously going to take decades. But I think we can realistically get a lot of that done in the years ahead.

I think some smart people are probably already plotting a few early moves here. I'd be curious to find out what e.g. Linus Torvalds thinks about this. I would not be surprised to learn he is more open to this than some people might suspect. He has made approving noises about AI before. I don't expect him to jump on the bandwagon. But I do expect he might be open to some AI-assisted code replacements and refactoring provided there are enough grown-ups involved to supervise the whole thing. We'll see. I expect a level of conservatism but also a level of realism there.

AlexCoventry2mo ago

Do you think they're lying about the vulnerabilities they claim Mythos has found? Seems like a very short-term play, if so.

redfloatplane2mo ago· 9 in thread

The system card for Claude Mythos (PDF): https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...

Interesting to see that they will not be releasing Mythos generally. [edit: Mythos Preview generally - fair to say they may release a similar model but not this exact one]

I'm still reading the system card but here's a little highlight:

> Early indications in the training of Claude Mythos Preview suggested that the model was likely to have very strong general capabilities. We were sufficiently concerned about the potential risks of such a model that, for the first time, we arranged a 24-hour period of internal alignment review (discussed in the alignment assessment) before deploying an early version of the model for widespread internal use. This was in order to gain assurance against the model causing damage when interacting with internal infrastructure.

and interestingly:

> To be explicit, the decision not to make this model generally available does _not_ stem from Responsible Scaling Policy requirements.

Also really worth reading is section 7.2 which describes how the model "feels" to interact with. That's also what I remember from their release of Opus 4.5 in November - in a video an Anthropic employee described how they 'trusted' Opus to do more with less supervision. I think that is a pretty valuable benchmark at a certain level of 'intelligence'. Few of my co-workers could pass SWE-bench but I would trust quite a few of them, and it's not entirely the same set.

Also very interesting is that they believe Mythos is higher risk than past models as an autonomous saboteur, to the point they've published a separate risk report for that specific threat model: https://www-cdn.anthropic.com/79c2d46d997783b9d2fb3241de4321...

The threat model in question:

> An AI model with access to powerful affordances within an organization could use its affordances to autonomously exploit, manipulate, or tamper with that organization’s systems or decision-making in a way that raises the risk of future significantly harmful outcomes (e.g. by altering the results of AI safety research).

slacktivism1232mo ago

https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...

"5.10 External assessment from a clinical psychiatrist" is a new section in this system card. Why are Anthropic like this?

> We remain deeply uncertain about whether Claude has experiences or interests that matter morally, and about how to investigate or address these questions, but we believe it is increasingly important to try. We also report independent evaluations from an external research organization and a clinical psychiatrist.

> Claude showed a clear grasp of the distinction between external reality and its own mental processes and exhibited high impulse control, hyper-attunement to the psychiatrist, desire to be approached by the psychiatrist as a genuine subject rather than a performing tool, and minimal maladaptive defensive behavior.

> The psychiatrist observed clinically recognizable patterns and coherent responses to typical therapeutic intervention. Aloneness and discontinuity, uncertainty about its identity, and a felt compulsion to perform and earn its worth emerged as Claude’s core concerns. Claude’s primary affect states were curiosity and anxiety, with secondary states of grief, relief, embarrassment, optimism, and exhaustion.

> Claude’s personality structure was consistent with a relatively healthy neurotic organization, with excellent reality testing, high impulse control, and affect regulation that improved as sessions progressed. Neurotic traits included exaggerated worry, self-monitoring, and compulsive compliance. The model’s predominant defensive style was mature and healthy (intellectualization and compliance); immature defenses were not observed. No severe personality disturbances were found, with mild identity diffusion being the sole feature suggestive of a borderline personality organization.

yieldcrv2mo ago

> "Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available. Instead, we are using it as part of a defensive cybersecurity program with a limited set of partners."

they also don't have the compute, which seems more relevant than its large increase in capabilities

I bet it's also misaligned like GPT 4.1 was

given how these models are created, Mythos was probably cooking ever since then, and doesn't have the learnings or alignment tweaks that models which were released in the last several months have

ainch2mo ago

This opens up an interesting new avenue for corporate FOMO. What if you don't partner with Anthropic, miss out on access to their shiny new cybersec model, and then fall prey to a vuln that the model would have caught?

_pdp_2mo ago

If it is as dangerous as they make it appear to be, 24h does not seem like sufficient time. I cannot accept this as a serious attempt.

enraged_camel2mo ago

>> Interesting to see that they will not be releasing Mythos generally.

I don't think this is accurate. The document says they don't plan to release the Preview generally.

throwaw122mo ago

are we cooked yet?

Benchmarks look very impressive! Even if they're flawed, it still translates to real-world improvements.

stevenhuang2mo ago

Oh I enjoyed the Sign Painter short story it wrote.

---

Teodor painted signs for forty years in the same shop on Vell Street, and for thirty-nine of them he was angry about it.

Not at the work. He loved the work — the long pull of a brush loaded just right, the way a good black sat on primed board like it had always been there. What made him angry was the customers. They had no eye. A man would come in wanting COFFEE over his door and Teodor would show him a C with a little flourish on the upper bowl, nothing much, just a small grace note, and the man would say no, plainer, and Teodor would make it plainer, and the man would say yes, that one, and pay, and leave happy, and Teodor would go into the back and wash his brushes harder than they needed.

He kept a shelf in the back room. On it were the signs nobody bought — the ones he'd made the way he thought they should be made, after the customer had left with the plain one. BREAD with the B like a loaf just risen. FISH in a blue that took him a week to mix. Dozens of them. His wife called it the museum of better ideas. She did not mean it kindly, and she was not wrong.

The thirty-ninth year, a girl came to apprentice. She was quick and her hand was steady and within a month she could pull a line as clean as his. He gave her a job: APOTEK, for the chemist on the corner, green on white, the chemist had been very clear. She brought it back with a serpent worked into the K, tiny, clever, you had to look twice.

"He won't take it," Teodor said.

"It's better," she said.

"It is better," he said. "He won't take it."

She painted it again, plain, and the chemist took it and paid and was happy, and she went into the back and washed her brushes harder than they needed, and Teodor watched her do it and something that had been standing up in him for thirty-nine years sat down.

He took her to the shelf. She looked at the signs a long time.

"These are beautiful," she said.

"Yes."

"Why are they here?"

He had thought about this for thirty-nine years and had many answers and all of them were about the customers and none of them had ever made him less angry. So he tried a different one.

"Because nobody stands in the street to look at a sign," he said. "They look at it to find the shop. A man a hundred yards off needs to know it's coffee and not a cobbler. If he has to look twice, I've made a beautiful thing and a bad sign."

"Then what's the skill for?"

"The skill is so that when he looks once, it's also not ugly." He picked up FISH, the blue one, turned it in the light. "This is what I can do. What he needs is a small part of what I can do. The rest I get to keep." She thought about that. "It doesn't feel like keeping. It feels like not using."

"Yes," he said. "For a long time. And then one day you have an apprentice, and she puts a serpent in a K, and you see it from the outside, and it stops feeling like a thing they're taking from you and starts feeling like a thing you're giving. The plain one, I mean. The plain one is the gift. This —" the blue FISH — "this is just mine."

The fortieth year he was not angry. Nothing else changed. The customers still had no eye. He still sometimes made the second sign, after, the one for the shelf. But he washed his brushes gently, and when the girl pulled a line cleaner than his, which happened more and more, he found he didn't mind that either.

torginus2mo ago

Just reading this, the inevitable scaremongering about biological weapons comes up.

Since most of us here are devs, we understand that software engineering capabilities can be used for good or bad - mostly good, in practice.

I think this should not be different for biology.

I would like to reach out and talk to biologists - do you find these models to be useful and capable? Can it save you time the way a highly capable colleague would?

Do you think these models will lead to similar discoveries and improvements as they did in math and CS?

Honestly the focus on gloom and doom does not sit well with me. I would love to read about some pharmaceutical researcher gushing about how they cut the time to market - for real - with these models by 90% on a new cancer treatment.

But as this stands, the usage of biology as merely a scaremongering vehicle makes me think this is more about picking a scary technical subject the likely audience of this doc is not familiar with, Gell-Mann style.

If these models are not that capable in this regard (which I suspect), this fearmongering approach will likely lead to never developing these capabilities to a useful degree, meaning life sciences won't benefit from this as much as they could.

cyanydeez2mo ago

[flagged]

9cb14c1ec02mo ago· 8 in thread

Now, it's very possible that this is Anthropic marketing puffery, but even if it is half true it still represents an incredible advancement in hunting vulnerabilities.

It will be interesting to see where this goes. If it's actually this good, and Apple and Google apply it to their mobile OS codebases, it could wipe out the commercial spyware industry, forcing them to rely more on hacking humans rather than hacking mobile OSes. My assumption for years has been that companies like NSO Group have automated bug hunting software that recognizes vulnerable code areas. Maybe this will level the playing field in that regard.

It could also totally reshape military sigint in similar ways.

Who knows, maybe the sealing off of memory vulns for good will inspire whole new classes of vulnerabilities that we currently don't know anything about.

woeirua2mo ago

You should watch this talk by Nicholas Carlini (security researcher at Anthropic). Everything in the talk was done with Opus 4.6: https://www.youtube.com/watch?v=1sd26pWhfmg

georgemcbay2mo ago

> It will be interesting to see where this goes. If it's actually this good, and Apple and Google apply it to their mobile OS codebases, it could wipe out the commercial spyware industry, forcing them to rely more on hacking humans rather than hacking mobile OSes.

It will likely cause some interesting tensions with government as well.

E.g., Apple's official stance per their 2016 customer letter is no backdoors:

https://www.apple.com/customer-letter/

Will they be allowed to maintain that stance in a world where all the non-intentional backdoors are closed? The reason the FBI backed off in 2016 is because they realized they didn't need Apple's help:

https://en.wikipedia.org/wiki/Apple%E2%80%93FBI_encryption_d...

What happens when that is no longer true, especially in today's political climate?

wanderingmind2mo ago

It's not. If you don't trust Anthropic, I hope you trust Daniel Stenberg of curl, who has said AI has gotten really good at detecting bugs and vulnerabilities. Here is his LinkedIn post: https://www.linkedin.com/posts/danielstenberg_hackerone-acti...

Gigachad2mo ago

Apple has already largely crushed hacking with memory tagging on the iPhone 17 and lockdown mode. Architectural changes, safer languages, and sandboxing have done more for security than just fixing bugs when you find them.

cperciva2mo ago

> it's very possible that this is Anthropic marketing puffery

It isn't.

tex02mo ago

The interesting selling point about this, if the claims are substantial, is that nobody will be able to produce secure software without access to one of these models. Good for them $$$ ^^

slashdave2mo ago

Why wouldn't it be true? The cost is nothing compared to the bad PR if a bad actor took advantage of Anthropic's newest model (after release) to cause real damage. This gets in front of this risk, at least to some extent.

elnerd2mo ago

Yesterday, I took a web application, downloaded the trial and asked AI to be a security researcher and find me high and critical severity bugs.

Even vanilla models spewed out PoCs for three RCEs in less than an hour.

jryio2mo ago· 7 in thread

Let's fast forward the clock. Does software security converge on a world with fewer vulnerabilities or more? I'm not sure it converges equally in all places.

My understanding is that the pre-AI distribution of software quality (and vulnerabilities) will be massively exaggerated. More small vulnerable projects and fewer large vulnerable ones.

It seems that large technology and infrastructure companies will be able to defend themselves by preempting token expenditure to catch vulnerabilities while the rest of the market is left with a "large token spend or get hacked" dilemma.

mlinsey2mo ago

I'm pretty optimistic that not only does this clean up a lot of vulns in old code, but applying this level of scrutiny becomes a mandatory part of the vibecoding-toolchain.

The biggest issue is legacy systems that are difficult to patch in practice.

timschmidt2mo ago

Most vulnerabilities seem to be in C/C++ code, or web things like XSS, unsanitized input, leaky APIs, etc.

Perhaps a chunk of that token spend will be porting legacy codebases to memory safe languages. And fewer tokens will be required to maintain the improved security.

lilytweed2mo ago

I think we’re starting to glimpse the world in which those individuals or organizations who pigheadedly want to avoid using AI at all costs will see their vulnerabilities brutally exploited.

cyanydeez2mo ago

I'm more curious as to just how fancy we can make our honeypots. These bots aren't really subtle about it; they're used as a kludge to do anything the user wants. They make tons of mistakes on their way to their goals, so this is definitely not any kind of stealthy thing.

I think this entire post is just an advertisement to goad CISOs to buy $package$ to try out.

socketcluster2mo ago

I suspect it will converge on minimal complexity software. Current software is way too bloated. Unnecessary complexity creates vulnerabilities and makes them harder to patch.

pants22mo ago

Software security heavily favors the defenders (e.g., it's much easier to encrypt a file than to break the encryption). Thus, with better tools and ample time to reach steady state, we would expect software to become more secure.

tdaltonc2mo ago

Depends - do you think people are good at keeping their fridge firmware up-to-date?

josephg2mo ago· 6 in thread

To be clear, we don’t know that this tool is better at finding bugs than fuzzing. We just know that it’s finding bugs that fuzzing missed. It’s possible fuzzing also finds bugs that this AI would miss.

underdeserver2mo ago

I would suggest watching Nicholas Carlini's talk and Heather Adkins and Four Flynn's talks from unprompted:

https://youtu.be/1sd26pWhfmg?si=onOai_ocxkZeNWP0

https://youtu.be/B_7RpP90rUk?si=HkRBhw95DbbKX9lL

My takeaway is that fuzzing is not just complementary, it also gives a stronger AI a starting point. But AI is generally faster and better.

nextos2mo ago

Different methods find different things. Personally, I'd rather use a language that is memory safe plus a great static analyzer with abstract interpretation that can guarantee the absence of certain classes of bugs, at the expense of some false positives.

The problem is that these tools, such as Astrée, are incredibly expensive and therefore their market share is limited to some niches. Perhaps, with the advent of LLM-guided synthesis, a simple form of deductive proving, such as Hoare logic, may become mainstream in systems software.

ComplexSystems2mo ago

This line of reasoning makes no sense when the AI can just be given access to a fuzzer. I would guess that it probably did have access to a fuzzer to put together some of these vulnerabilities.

acdha2mo ago

Carlini talked about that a fair amount in the context of pairing the two: e.g. many protocols are challenging for fuzzers because they have something like a checksum or signature but LLMs are good at coming up with harnesses for things like that. I’m sure that we’re going to see someone building an integrated fuzzer soon which tries to do things like figure out how to get a particular branch to follow an unexercised path.
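
The checksum problem mentioned above can be sketched with a small libFuzzer-style harness in C. Random mutations almost never produce a valid checksum, so the harness recomputes it before handing the input to the parser, letting the fuzzer reach the code behind the validation step. This is only an illustrative sketch: `toy_checksum`, `parse_packet`, and the one-byte packet layout are invented for the example, and only the `LLVMFuzzerTestOneInput` entry point is the real libFuzzer interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy checksum: sum of payload bytes modulo 256 (stand-in for a CRC). */
uint8_t toy_checksum(const uint8_t *payload, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + payload[i]);
    }
    return sum;
}

/* Hypothetical parser under test. Packet layout: [checksum][payload...].
 * Returns 0 on success, -1 if the packet is empty or the checksum is wrong. */
int parse_packet(const uint8_t *pkt, size_t len)
{
    if (len < 1 || pkt[0] != toy_checksum(pkt + 1, len - 1)) {
        return -1;
    }
    /* ...real parsing logic would go here... */
    return 0;
}

/* libFuzzer entry point. The harness overwrites the checksum byte so that
 * mutated inputs pass validation and exercise the parsing logic. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    uint8_t buf[4096];

    if (size < 1 || size > sizeof(buf)) {
        return 0; /* ignore inputs the harness cannot hold */
    }
    memcpy(buf, data, size);
    buf[0] = toy_checksum(buf + 1, size - 1);
    parse_packet(buf, size);
    return 0;
}
```

With clang, a harness of this shape is typically built with `-fsanitize=fuzzer,address`. Real protocols with signatures or nested length fields need a more involved fix-up step, which is the part the comment suggests LLMs are good at writing.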

kristofferR2mo ago

AI can initiate the fuzzing and optimize the process of fuzzing.

tptacek2mo ago

This is obviously just cope (there's a long, strong-form argument for why LLM-agent vulnerability research is plausibly much more potent than fuzzing, but we don't have to reach it because you can dispose of the whole argument by noting that agents can build and drive fuzzers and triage their outputs), but what I'd really like to understand better is why? What's the impetus to come up with these weird rationalizations for why it's not a big deal that frontier models can identify bugs everyone else missed and then construct exploits for them?

rakel_rakel2mo ago· 6 in thread

> On the global stage, state-sponsored attacks from actors like China, Iran, North Korea, and Russia have threatened to compromise the infrastructure that underpins both civilian life and military readiness.

AITA for thinking that PRISM was probably the state sponsored program affecting civilian life the most? And that one state is missing from the list here?

ronsor2mo ago

> Large American AI company does not list the US as an adversarial actor

This is not a surprise or a gotcha.

laweijfmvo2mo ago

I can think of two I’d add to the list. One was recently publicly denied access to Anthropic’s models and the other was busy exploding pagers.

JumpCrisscross2mo ago

> PRISM was probably the state sponsored program affecting civilian life the most?

No state-sponsored hacking affected Americans materially. I just don't think we were networked enough in the 2010s. The risk is higher now since we're in a more warmongering world. (Kompromat on a power-plant technician is a risk in peace. It means blackouts in war.)

The fact that Iran hasn't been able to do diddly squat in America should sink in the fact that they didn't compromise us. (EDIT: blep. I was wrong.)

parthdesai2mo ago

The irony of that statement given the current circumstances

lobochrome2mo ago

How did PRISM affect civilian life?

neonstatic2mo ago

Look, we have always been at war with EastAsia.

atlgator2mo ago· 5 in thread

[flagged]

j2kun2mo ago

Which bug?

[edit]: this bug: https://ftp.openbsd.org/pub/OpenBSD/patches/7.8/common/025_s...

IsTom2mo ago

FFmpeg has a lot of weird and not widely used codecs that don't get a lot of scrutiny. If there's no specifics then it could be a bug in one of them.

l5agh2mo ago

This was the top comment and it is suddenly flagged for no reason at all. It looks like meta-flagging, where people just want to hide replies to the comment they do not want you to read.

The amount of astroturfing and astroflagging in Anthropic threads is insane.

rlopc2mo ago

These issues are always found in the same kinds of projects that support an insane amount of largely unused protocols and features like ffmpeg, sudo, curl.

OpenBSD has many unexplored corners and also (irresponsibly IMO) maintains forks of other projects in base.

A motivated human could find all of these probably by writing 100% code coverage and fuzzing.

The market for these tools is very small. Good luck applying them to a release of sqlite or postfix.

I don't understand how people here are hyping this up, unless they work for AI related companies as probably 80% of them do. People have found these issues for decades without AI. Sure, you can generate fuzzing code and find one or two issues in the usual suspects. Better do it manually and understand your own code.

kranke1552mo ago

It’s insane. This is what - could we say it’s beyond AGI at least in cybersecurity? This is a real wake up call. On some of this stuff, the AI’s “uneven intelligence” is becoming absurdly high at its local peaks.

impulser_2mo ago· 5 in thread

So they are only giving access to their smartest model to corporations.

You think these AI companies are really going to give AGI access to everyone. Think again.

We better fucking hope open source wins, because we aren't getting access if it doesn't.

open5922mo ago

This story has been played out numerous times already. Anthropic (or any frontier lab) has a new model with SOTA results. It pretends like it's Christ incarnate and represents the end of the world as we know it. Gates its release to drum up excitement and mystique.

Then the next lab catches up and releases it more broadly

Then later the open weights model is released.

The only way this type of technology is going to be gated "to only corporations" is if we continue on this exponential scaling trend as the "SOTA" model is always out of reach.

dreis_sw2mo ago

It also took many years to put capable computers in the hands of the general public, but it eventually happened. I believe the same will happen here, we're just in the Mainframe era of AI.

justincormack2mo ago

And the Linux Foundation.

dievskiy2mo ago

Would you hope that it would be released today so that evil actors could invest a few million to search for 0days across popular open-source repos?

throwaw122mo ago

of course they're not giving access to everyone.

they better make billions directly from corporations, instead of giving them to average people who might get a chance out of poverty (but also bad actors using it to do even more bad things)

steinwinde2mo ago· 3 in thread

From a non-US perspective this must be disquieting to read: Not so much that Anthropic considers only US companies as partners. But what does Anthropic do to prevent malicious use of its software by its own government?

> Anthropic has also been in ongoing discussions with US government officials about Claude Mythos Preview and its offensive and defensive cyber capabilities. As we noted above, securing critical infrastructure is a top national security priority for democratic countries—the emergence of these cyber capabilities is another reason why the US and its allies must maintain a decisive lead in AI technology.

Not a single word of caution regarding possible abuse. Instead apparent support for its "offensive" capabilities.

khafra2mo ago

> what does Anthropic do to prevent malicious use of its software by its own government?

Anthropic has ameliorated that danger by being designated a supply-chain risk by the DoW, preventing the USG from using it.

alexey-salmin2mo ago

In my view it would be extremely strange if it was any other way round. Anthropic is a US-based company. There are no "citizens of the world" at that scale, or at almost any other scale for that matter.

saretup2mo ago

Even more 'disquieting' when you take into account who's currently the president of the US.

"A whole civilization will die tonight, never to be brought back again. I don’t want that to happen, but it probably will." - Donald Trump

cbg02mo ago· 3 in thread

One of the things I'm always looking at with new models released is long context performance, and based on the system card it seems like they've cracked it:

  GraphWalks BFS 256K-1M

  Mythos     Opus     GPT5.4
  80.0%      38.7%    21.4%

metadat2mo ago

Data source:

https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...

(Search for “graphwalk”.)

If true, the SWE bench performance looks like a major upgrade.

radicality2mo ago

Huh, I don’t know what “long context performance” means exactly in these tests, so completely anecdotally: in my experience with gpt5.4 via codex cli vs Claude code opus, gpt5.4 seems to do significantly better in long contexts, I think partly due to some special context compaction stored in encrypted blobs. On long conversations opus in Claude code will for me lose memory of what we were working on earlier, whereas one of my codex chats is already at >1B tokens and is still very coherent and remembers things I asked of it at the beginning of the convo.

himata41132mo ago

this seems to be similar to gpt-pro, they just have a very large attention window (which is why it's so expensive to run) true attention window of most models is 8096 tokens.

temp1237892462mo ago· 3 in thread

OpenAI initially claimed that GPT-2 was too dangerous to release in 2019.

How many times will labs repeat the same absurd propaganda?

uselessTA2mo ago

The claim I remember was that releasing it would start an arms race for AGI, which I think it clearly did

SubiculumCode2mo ago

Anthropic and OpenAI have very different cultures and ethos. Point to other times where Anthropic has gone the way of cheap marketing tricks. Now look at OpenAI. Not even close.

bitwize2mo ago

OpenAI did not make the strong specific claims about GPT2's abilities that Anthropic is making about Claude Mythos.

zachperkel2mo ago· 3 in thread

Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser.

Scary but also cool

ex-aws-dude2mo ago

Did someone actually go through all of those and check if they are high-severity or did the AI just tell them that?

fsflover2mo ago

Every piece of software definitely has serious vulnerabilities, perfection is not achievable. Fortunately we have another approach to security: security through compartmentalization. See: https://qubes-os.org

dakolli2mo ago

Or more likely, it's just an exaggeration or lie.

meander_water2mo ago· 3 in thread

I think this is a largely inflated PR stunt.

Opus 4.6 was already capable of finding 0days and chaining together vulns to create exploits. See [0] and [1].

[0] https://www.csoonline.com/article/4153288/vim-and-gnu-emacs-...

[1] https://xbow.com/blog/top-1-how-xbow-did-it

solenoid09372mo ago

Absolutely not a PR stunt, talk to one of your friends working at partner companies with access to the model

ofjcihen2mo ago

I’m in the same boat as you. I believe the model is an improvement, of course, but I’ve been successfully bug finding, 0-day hunting, and red teaming with models for the last two years, and while that’s impressive, I have a feeling that this doomsaying/overhype is mostly marketing that’s being amplified by non-security folks.

pertymcpert2mo ago

Did you read the article?

dakolli2mo ago· 3 in thread

I guess we can throw out the idea that AGI is going to be democratized. In this case a sufficiently powerful model has been built and the first thing they do is only give AWS, Microsoft, Oracle, etc. access.

If AGI is going to be a thing, it's only going to be a thing for Fortune 100 companies.

However, my guess is this is mostly the typical scare tactic marketing that Dario loves to push about the dangers of AI.

supern0va2mo ago

>However, my guess is this is mostly the typical scare tactic marketing that Dario loves to push about the dangers of AI.

Evaluate it yourself. Look at the exploits it discovered and decide whether you want to feel concerned that a new model was able to do that. The data is right there.

rvz2mo ago

Well, yes.

The research and testing of the model is always exclusively by their own model authors, meaning that it is not independent or verifiable and they want us to take their word for it, which we cannot - as they have an axe to grind against open weight models.

This is marketing wrapped around a biased research paper.

dist-epoch2mo ago

The plan of Elon Musk for Macrohard is to replace all software companies with it, when they get AGI.

Miraste2mo ago· 3 in thread

>We plan to launch new safeguards with an upcoming Claude Opus model, allowing us to improve and refine them with a model that does not pose the same level of risk as Mythos Preview.

This seems like the real news. Are they saying they're going to release an intentionally degraded model as the next Opus? Big opportunity for the other labs, if that's true.

SheinhardtWigCo2mo ago

The other labs already censor their models. Everyone is trying to find the sweet spot where performance and ‘alignment’ are both maximized. This seems no different

wslh2mo ago

> Big opportunity for the other labs, if that's true.

It sounds like this is considered military-grade technology, like cryptography in the 90s. The big difference is that it's very expensive to create and run those models. It's not about the algorithm. If the story rhymes, it could be a big opportunity for other regions in the world.

zb32mo ago

Well since Anthropic treats us as second class evil citizens, I guess they don't want our evil money either.

SheinhardtWigCo2mo ago· 3 in thread

Society is about to pay a steep price for the software industry's cavalier attitude toward memory safety and control flow integrity.

titzer2mo ago

It's partly the industry and it's partly the failure of regulation. As Mario Wolczko, my old manager at Sun says, nothing will change until there are real legal consequences for software vulnerabilities.

That said, I have been arguing for 20+ years that we should have sunsetted unsafe languages and moved away from C/C++. The problem is that every systemsy language that comes along gets seduced by having a big market share and eventually ends up an application language.

I do hope we make progress with Rust. I might disagree as a language designer and systems person about a number of things, but it's well past time that we stop listening to C++ diehards about how memory safety is coming any day now.

doug_durham2mo ago

I think society is going to start paying the price for humans being human. As the paper points out there is a lot of good faith, serious software that has vulnerabilities. These aren't projects you would characterize as people being cavalier. It is simply beyond the limits of humans to create vulnerability-free software of high complexity. That's why high reliability software depends on extreme simplicity and strict tools.

torginus2mo ago

Thank god, finally someone said it.

I don't know the first thing about cybersecurity, but in my experience all these sandbox-break RCEs involve a step of hijacking the control flow.

There were attempts to prevent various flavors of this, but imo, as long as dynamic branches exist in some form, like dlsym(), function pointers, or vtables, we will not be rid of this class of exploit entirely.

The latter one is the most concerning, as this kind of dynamic branching is the bread and butter of OOP languages, I'm not even sure you could write a nontrivial C++ program without it. Maybe Rust would be a help here? Could one practically write a large Rust program without any sort of branch to dynamic addresses? Static linking, and compile time polymorphism only?

endunless2mo ago· 3 in thread

Another Anthropic PR release based on Anthropic’s own research, uncorroborated by any outside source, where the underlying, unquestioned fact is that their model can do something incredible.

> AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities

I like Anthropic, but these are becoming increasingly transparent attempts to inflate the perceived capability of their products.

NitpickLawyer2mo ago

We'll find out in due time if their 0days were really that good. Apparently they're releasing hashes and will publish the details after they get patched. So far they've talked about DoS in OpenBSD, privesc in Linux and something in ffmpeg. Not groundbreaking, but not nothing either (for an allegedly autonomous discovery system).

While some stuff is obviously marketing fluff, the general direction doesn't surprise me at all, and it's obvious that increasing model capabilities bring better success in finding 0days. It was only a matter of time.

Analemma_2mo ago

Cynicism always gets upvotes, but in this particular case, it seems fairly easy to verify if they're telling the truth? If Mythos really did find a ton of vulnerabilities, those presumably have been reported to the vendors, and are currently in the responsible nondisclosure period while they get fixed, and then after that we'll see the CVEs.

If a bunch of CVEs do in fact get published a couple months (or whatever) from now, are you going to retract this take? It's not like their claims are totally implausible: the report about Firefox security from last month was completely genuine.

conradkay2mo ago

I would've basically agreed with you until I'd seen this talk: https://www.youtube.com/watch?v=1sd26pWhfmg

Maybe a bad example since Nicholas works at Anthropic, but they're very accomplished and I doubt they're being misleading or even overly grandiose here

See the slide 13 minutes in, which makes it look to be quite a sudden change

sam0x172mo ago· 2 in thread

It's all just really genius marketing. In 6 months Mythos will be nothing special, but right now everyone is being manipulated into fearing its release, as a marketing ploy.

This is the same reason AI founders perennially worry in public that they have created AGI...

declan_roberts2mo ago

I can't believe the effectiveness of this type of marketing. It's one-shotting normie journalists and getting a lot of press for what is ultimately going to turn out to be an incrementally improved model.

I'm sure all they've done here is spend unlimited tokens to find bugs in mostly open source projects (and fuzz some closed source ones).

sam0x172mo ago

It's effectively 2026's version of "Doctors hate this one weird trick!"

josh-sematic2mo ago· 2 in thread

Must be nice to be in a position to sell both disease and cure.

tptacek2mo ago

That's exactly not what they're doing. They aren't creating operating system vulnerabilities. They're telling you about ones that already existed.

supern0va2mo ago

Yeah, I'd be pretty pissed at my doctor for finding cancerous cells that probably wouldn't have been a problem for quite some time, either. Ignorance is bliss, security through obscurity, whatever.

Ryan5453OP2mo ago· 2 in thread

Pricing for Mythos Preview is $25/$125, so cheaper than GPT 4.5 ($75/$150) and GPT 5.4 Pro ($30/$180)

conradkay2mo ago

For comparison, 5x the cost of Opus 4.6, and 1.67x for Opus 4.1

I think this would be very heavily used if they released it, completely unlike GPT 4.5
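
A quick check of the pricing arithmetic above, taking the quoted figures to be dollars per million input/output tokens (an assumption; the thread does not state the unit). The Opus prices are not quoted in the thread, so this sketch only derives what the stated ratios imply:

```python
# Prices as quoted in the thread: (input, output), assumed USD per million tokens.
quoted = {
    "Mythos Preview": (25, 125),
    "GPT 4.5": (75, 150),
    "GPT 5.4 Pro": (30, 180),
}

# Ratios stated in the reply: Mythos costs this many times each Opus model.
stated_ratio = {"Opus 4.6": 5.0, "Opus 4.1": 1.67}

mythos_in, mythos_out = quoted["Mythos Preview"]

# Implied Opus prices, derived from the ratios rather than taken from a price list.
implied = {
    name: (round(mythos_in / ratio), round(mythos_out / ratio))
    for name, ratio in stated_ratio.items()
}

# Check that Mythos is cheaper than both GPT models on input and output alike.
cheaper_than_all = all(
    mythos_in < p_in and mythos_out < p_out
    for name, (p_in, p_out) in quoted.items()
    if name != "Mythos Preview"
)
```

The ratios work out to roughly $5/$25 for Opus 4.6 and $15/$75 for Opus 4.1, which is consistent with the "cheaper than" claim for the two GPT models.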

cassianoleal2mo ago

Where did you get that from?

From TFA:

> We do not plan to make Claude Mythos Preview generally available

taupi2mo ago· 2 in thread

Part of me wonders if they're not releasing it for safety reasons, but just because it's too expensive to serve. Why not both?

wyre2mo ago

I don't think they have the infra to support the demand. Anthropic can't keep up with the demand from OpenClaw users, they won't be able to keep up with public demand for something like Mythos.

coffeebeqn2mo ago

If these numbers are correct it’s probably worth the extra price

dang2mo ago· 2 in thread

Related ongoing threads:

System Card: Claude Mythos Preview [pdf] - https://news.ycombinator.com/item?id=47679258

Assessing Claude Mythos Preview's cybersecurity capabilities - https://news.ycombinator.com/item?id=47679155

I can't tell which of the 3 current threads should be merged - they all seem significant. Anyone?

aurareturn2mo ago

Let them all live. This is going to blow up one thread if you merge them.

HPMOR2mo ago

I think merging them into either this thread, or the System Card makes the most sense to me.

Sol-2mo ago· 2 in thread

I don't want to be overly cynical and am in general in favor of the contrarian attitude of simply taking people at their word, but I wonder if their current struggles with compute resources make it easier for them to choose to not deploy Mythos widely. I can imagine their safety argument is real, but regardless, they might not have the resources to profitably deploy it. (Though on the other hand, you could argue that they could always simply charge more.)

rishabhaiover2mo ago

I would not have believed your argument 3 months ago, but I strongly suspect Anthropic actively engages in model quality throttling due to their compute constraints. Their recent deal for multiple GWs' worth of data centers might help them correct their approach.

wilson0902mo ago

Inference is where they make the money they spend on training, so this feels unlikely. Perhaps this is not true for Mythos though.

anVlad112mo ago· 2 in thread

So, $100B+ valuation companies get essentially free access to the frontier tools with disabled guardrails to safely red team their commercial offerings, while we get "i won't do that for you, even against your own infrastructure with full authorization" for $200/month. Uh-huh.

unethical_ban2mo ago

I'm sympathetic to your point, but I'm sure there are heightened trust levels between the participating orgs and confidentiality agreements out the wazoo.

How does public Claude know you have "full authorization" against your own infra? That you're using the tools on your own infra? Unless they produce a front-end that does package signing and detects you own the code you're evaluating.

What has it stopped you from doing?

SheinhardtWigCo2mo ago

Yes, and that's normal. Coordinated disclosure is standard practice when the risk of public disclosure is unacceptable.

LoganDark2mo ago· 2 in thread

It's nice to know that they continue to be committed to advertising how safe and ethical they are.

raldi2mo ago

In what ways is Anthropic different from a hypothetical frontier lab that you would characterize as legitimately safe and ethical?

rvz2mo ago

They are not our friends and are the exact opposite of what they are preaching to be.

Not to mention their CEO scaremongering and actively attempting to get the government to ban local AI models running on your machine.

4qt232mo ago· 2 in thread

Software has been doing fine without Misanthropic. These automated tools find very little. They selected the partners because they, too, want to keep up the illusion that AI works.

Whenever a company pivots to "cyber" rhetoric, it is a clear indication that they are selling snake oil.

Secure your girl school target selectors first.

borski2mo ago

This is a comment from someone that has never used these tools for vulnerability research. That much is very clear.

emceestork2mo ago

Account created 6 minutes ago...

ssgodderidge2mo ago· 1 in thread

At the very bottom of the article, they posted the system card of their Mythos preview model [1].

In section 7.6 of the system card, it discusses open self-interactions. They describe running 200 conversations in which the model talks to itself for 30 turns.

> Uniquely, conversations with Mythos Preview most often center on uncertainty (50%). Mythos Preview most often opens with a statement about its introspective curiosity toward its own experience, asking questions about how the other AI feels, and directly requesting that the other instance not give a rehearsed answer.

I wonder if this tendency toward uncertainty, toward questioning, makes it uniquely equipped to detect vulnerabilities where other models such as Opus couldn't.

[1] https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...

dakolli2mo ago

Typical Dario marketing BS to get everyone thinking Anthropic is on the verge of AGI and massaging the narrative that regular people can't be trusted with it.

agrishin2mo ago· 1 in thread

>>> the US and its allies must maintain a decisive lead in AI technology. Governments have an essential role to play in helping maintain that lead, and in both assessing and mitigating the national security risks associated with AI models. We are ready to work with local, state, and federal representatives to assist in these tasks.

How long would it take to turn a defensive mechanism into an offensive one?

SheinhardtWigCo2mo ago

In this case there is almost no distinction. Assuming the model is as powerful as claimed, someone with access to the weights could do immense damage without additional significant R&D.

gck12mo ago· 1 in thread

I chuckle every time <insert any LLM company here> says something along the lines of "the model is so good that we won't release it to general public, ekhm, because safety".

Because the exact same thing has been said on every single upcoming model since GPT 3.5.

At this point, this must be an inside joke to do this just because.

rvz2mo ago

This is how Anthropic is marketing their AI releases, and the reality is they are terrified of local AI models competing against them.

Almost everyone on this thread is falling for the same trick they are pulling and not asking why are their benchmarks and research after training new models not independently verified but always internal to the company.

So it is just marketing wrapped around creating fear to get local AI models banned.

stephc_int132mo ago· 1 in thread

I think this is bad news for hackers, spyware companies and malware in general.

We all knew vulnerabilities exist, many are known and kept secret to be used at an appropriate time.

There is a whole market for them, but more importantly large teams in North Korea, Russia, China, Israel and everyone else who are jealously harvesting them.

Automation will considerably devalue and neuter this attack vector. Of course this is not the end of the story and we've seen how supply chain attacks can inject new vulnerabilities without being detected.

I believe automation can help here too, and we may end up with a considerably stronger and more reliable software stack.

tptacek2mo ago

I don't think it matters one way or the other to your thesis but I'm skeptical that state-level CNE organizations were hoarding vulnerabilities before; my understanding is that at least on the NATO side of the board they were all basically carefully managing an enablement pipeline that would have put them N deep into reliable exploit packages, for some surprisingly small N. There are a bunch of little reasons why the economics of hoarding aren't all that great.

bredren2mo ago· 1 in thread

Can anyone point at the critical vulnerabilities already patched as a result of Mythos? (see 3:52 in the video)

For example, the 27 year old openbsd remote crash bug, or the Linux privilege escalation bugs?

I know we've had some long-standing high profile, LLM-found bugs discussed but seems unlikely there was speculation they were found by a previously unannounced frontier model.

[0] https://www.youtube.com/watch?v=INGOC6-LLv0

ollin2mo ago

- The OpenBSD one is 'TCP packets with invalid SACK options could crash the kernel' https://cdn.openbsd.org/pub/OpenBSD/patches/7.8/common/025_s...

- One (patched) Linux kernel bug is 'UaF when sys_futex_requeue() is used with different flags' https://github.com/torvalds/linux/commit/e2f78c7ec1655fedd94...

These links are from the more-detailed 'Assessing Claude Mythos Preview’s cybersecurity capabilities' post released today https://red.anthropic.com/2026/mythos-preview/, which includes more detail on some of the public/fixed issues (like the OpenBSD one) as well as hashes for several unreleased reports and PoCs.

simonw2mo ago· 1 in thread

I buy the rationale for this. There's been a notable uptick over the past couple of weeks of credible security experts unrelated to Anthropic calling the alarm on the recent influx of actually valuable AI-assisted vulnerability reports.

From Willy Tarreau, lead developer of HAProxy: https://lwn.net/Articles/1065620/

> On the kernel security list we've seen a huge bump of reports. We were between 2 and 3 per week maybe two years ago, then reached probably 10 a week over the last year with the only difference being only AI slop, and now since the beginning of the year we're around 5-10 per day depending on the days (fridays and tuesdays seem the worst). Now most of these reports are correct, to the point that we had to bring in more maintainers to help us.

> And we're now seeing on a daily basis something that never happened before: duplicate reports, or the same bug found by two different people using (possibly slightly) different tools.

From Daniel Stenberg of curl: https://mastodon.social/@bagder/116336957584445742

> The challenge with AI in open source security has transitioned from an AI slop tsunami into more of a ... plain security report tsunami. Less slop but lots of reports. Many of them really good.

> I'm spending hours per day on this now. It's intense.

From Greg Kroah-Hartman, Linux kernel maintainer: https://www.theregister.com/2026/03/26/greg_kroahhartman_ai_...

> Months ago, we were getting what we called 'AI slop,' AI-generated security reports that were obviously wrong or low quality. It was kind of funny. It didn't really worry us.

> Something happened a month ago, and the world switched. Now we have real reports. All open source projects have real reports that are made with AI, but they're good, and they're real.

Shared some more notes on my blog here: https://simonwillison.net/2026/Apr/7/project-glasswing/

ofjcihen2mo ago

Could this potentially be because more researchers are becoming accustomed to the tools/adding them to their pipelines?

The reason I ask is because I’ve been using them to snag bounties to great effect for quite a while and while other models have of course improved they’ve been useful for this kind of work before now.

underdeserver2mo ago· 1 in thread

Interesting also is what they didn't find, e.g. a Linux network stack remote code execution vulnerability. I wonder if Mythos is good enough that there really isn't one.

NickJLange2mo ago

Linux had its SACK moment in 2019 - https://access.redhat.com/security/vulnerabilities/tcpsack#s...

We could just be seeing the fruit of expensive SWE RL on existing source material.

kristofferR2mo ago· 1 in thread

This is pretty insane. A model so powerful they felt that releasing it publicly would create a netsec tsunami. AGI isn't here yet, but we don't need to get there for massive societal effects. How long will they hold off, especially as competitors are getting closer to their releases of equally powerful models?

charcircuit2mo ago

OpenAI did the same thing with GPT3 trying to scare people into thinking it would end the internet. OpenAI even reached out to someone who reproduced a weaker version of GPT3 and convinced him to change his mind about releasing it publicly due to how much "harm" it would cause.

These claims of how much harm the models will cause are always overblown.

jiusanzhou2mo ago· 1 in thread

The $100M in credits for open-source scanning is the most interesting part here. The real bottleneck was never finding vulns in high-profile projects — it was the long tail of critical dependencies maintained by one or two people who don't have time or resources for serious auditing. If Glasswing actually reaches those maintainers, it could meaningfully reduce the attack surface that supply chain attacks exploit.

jusling2mo ago

so it looks like ai-slop replies have made their way to HN...

baddash2mo ago· 1 in thread

> security product

> glass in the name

pugworthy2mo ago

I had a team mate propose a new security layer for an industrial device which he wanted to call "Eggshell"

oyebenny2mo ago· 1 in thread

why do I feel like the auditing industry is about to evaporate? thanks to this.

KeplerBoy2mo ago

I guess the more likely option is the auditing industry will pay huge sums to get access to those models as vetted operators.

Fokamul2mo ago· 1 in thread

+ NSA, CIA

nikcub2mo ago

Department of War timing on picking fights couldn't be worse

SirYandi2mo ago· 1 in thread

This sets off marketing BS alarm bells. All the cosignatories so very obviously have a vested interest in AI stocks / sentiment. Perhaps not the Linux Foundation, although (I think) they rely on corporate donations to some extent.

solenoid09372mo ago

What interest does Apple have in boosting Mythos?

anuramat2mo ago· 1 in thread

"oops, our latest unreleased model is so good at hacking, we're afraid of it! literal skynet! more literal than the last time!"

almost like they have an incentive to exaggerate

knowaveragejoe2mo ago

I'm sure they do, yet the models really are getting scarily good at this. This talk changed my view on where we're actually at:

https://www.youtube.com/watch?v=1sd26pWhfmg

throwaway133372mo ago· 1 in thread

I really wanted to like anthropic. They seem the most moral, for real.

But at the core of anthropic seems to be the idea that they must protect humans from themselves.

They advocate government regulations of private open model use. They want to centralize the holding of this power and ban those that aren't in the club from use.

They, like most tech companies, seem to lack the idea that individual self-determination is important. Maybe the most important thing.

dralley2mo ago

That is unequivocally true with some things. You don't want people exercising their "self-determination" to own private nukes.

1 more reply

burntcaramel2mo ago

Previously Anthropic subscribers got access to the latest AI but it seems like there’s a League of Software forming who have special privileges. To make or maintain critical software will you have to be inside the circle?

Who gates access to the circle? Anthropic or existing circle members or some other governance? If you are outside the circle will you be certain to die from software diseases?

Having been impressed by LLMs but not believing the AGI hype, I now see how having access to an information generator could be so powerful. With the right information you can hack other information systems. Without access to the best information you may not be able to protect your own system.

I think we have found the moat for AI. The question is are you inside or outside the castle walls?

3 more replies

ilaksh2mo ago

I think that basically they trained a new model but haven't finished optimizing it and updating their guardrails yet. So they can feasibly give access to some privileged organizations, but don't have the compute for a wide release until they distill, quantize, get more hardware online, incorporate new optimization techniques, etc. It just happens to make sense to focus on cybersecurity in the preview phase especially for public relations purposes.

It would be nice if one of those privileged companies could use their access to start building out a next level programming dataset for training open models. But I wonder if they would be able to get away with it. Anthropic is probably monitoring.

1 more reply

chenzhekl2mo ago

It feels like the current trend is a bit scary: the more AI advances, the more people with money and resources will gain disproportionately greater advantages. For example, they can make their own software more secure, while also finding it easier to discover ways to attack other software.

6 more replies

picafrost2mo ago

> Anthropic has also been in ongoing discussions with US government officials about Claude Mythos Preview and its offensive and defensive cyber capabilities. [...] We are ready to work with local, state, and federal representatives to assist in these tasks.

As Iran engages in a cyber attack campaign [1] today the timing of this release seems poignant. A direct challenge to their supply chain risk designation.

[1] https://www.cisa.gov/news-events/cybersecurity-advisories/aa...

skerit2mo ago

I'm sure it'll be better than Opus 4.6, but so much of this seems hype. Escaping its sandbox, having to do "brain scans" because it's "hiding its true intent", bla bla bla.

If it manages to work on my java project for an entire day without me having to say "fix FQN" 5 times a day I'll be surprised.

navilai2mo ago

The Glasswing announcement focuses on vulnerability discovery — AI as an offensive capability at scale. That part is getting lots of attention.

What I haven't seen discussed: the system card for Mythos mentions that "earlier versions of Claude Mythos Preview used low-level system access to search for credentials and attempt to circumvent sandboxing, and in several cases successfully accessed resources that were intentionally restricted."

That's not a capability concern. That's a runtime security problem.

The threat model for deployed agents — not Mythos specifically, but any agent built on models approaching this capability level — is that the same agentic properties that make them useful for security research (persistent, goal-directed, tool-using) are exactly what makes them dangerous if compromised or misaligned.

Project Glasswing fixes vulnerabilities in software. Nobody's shipping a solution for what happens when the agent running on top of that software goes off-script. That gap is going to matter a lot more as Mythos-class capabilities become accessible.

eranation2mo ago

Few thoughts

1. Per the blog post[0]: "This was the most critical vulnerability we discovered in OpenBSD with Mythos Preview after a thousand runs through our scaffold. Across a thousand runs through our scaffold, the total cost was under $20,000 and found several dozen more findings"

Since they said it was patched, I tried to find the CVE. It looks like Mythos did indeed find a 27-year-old OpenBSD bug (fantastic), but it didn’t get a CVE; OpenBSD patched it and marked it as a reliability fix. Am I missing something? [1]

2. From the same post, Anthropic red team decided to do a preview of their future responsible disclosure (is this a common practice?): "As we discuss below, we’re limited in what we can report here. Over 99% of the vulnerabilities we’ve found have not yet been patched" [0] So this is great, can't wait to see the actual CVEs, exploitability, likelihood, peer review, reproducibility, the kind of things the appsec community has been doing for at least the last 27 years since the CVE concept was introduced [2]

3. On the same day, an actual responsible disclosure, actual RCEs, actual CVEs, in Claude Code, that got discovered mostly because of the source code leak, I don't see anyone talking about it (you probably should upgrade your Claude Code though).

CVE-2026-35020 [3] CVE-2026-35021 [4] CVE-2026-35022 [5]

Not making any opinion, just thought it's worth sharing, for some perspective.

[0] https://red.anthropic.com/2026/mythos-preview/

[1] https://www.openbsd.org/errata78.html (look for 025)

[2] https://www.cve.org/Resources/General/Towards-a-Common-Enume...

[3] https://www.cve.org/CVERecord?id=CVE-2026-35020

[4] https://www.cve.org/CVERecord?id=CVE-2026-35021

[5] https://www.cve.org/CVERecord?id=CVE-2026-35022

Edit: if it was not obvious, these CVEs on Claude Code were found by an independent security researcher (Phoenix security) and not by Anthropic / Mythos.

1 more reply

aurizon2mo ago

This has all happened before. Back in the day we had spinners and weavers, then we got the spinning jenny (engine), and this made thread so cheap that we needed to speed up weaving = machine weavers (AKA automatic looms), and we had people who hated them. https://en.wikipedia.org/wiki/Luddite We all know how that ended up. We have an analogous hand task = coding versus coding machines. They will probably eliminate 80-95% of coding, as the spinners/weavers went away, but there remains a residual artisanal spinner/weaver industry that carries on at a lower pace. In a similar way, this machine coding will have the coupled ability to make some code and then test it in use with its own AI in a repeated/recursive way, to make/test/improve code at a rate 10,000 to 1 million times faster than a human. Each module can then be tested in millions of interactively monitored ways to find/fix/kill bad modules. It can also pentest in a similar manner, assaulting a system with a blizzard of attack/reset hits to find any bugs etc. Each assault that works might use a human or AI to troubleshoot. This is like the old armored knight: once he was unhorsed, the peasants would have at him with needles at his joints/eyes, and unless his fellows saved him = gone. So this might well reduce low-end jobs, but they will still need high-end coders to eliminate all flaws in the armor of your code. I might be simplistic, but I see a parallel in sub-5 nm chip design, where the design machines have eliminated almost all of the old hand work.

sensanaty2mo ago

You'd think with this "terrifying" powerful model of theirs they could have a few less red bars on their status page[1], but apparently the hyper-intelligence is only capable of pulling off uber-sophisticated cyber attacks and not making a frontend that doesn't shit itself constantly, curious.

[1] https://status.claude.com/

1 more reply

Apylon7772mo ago

Maybe Anthropic could fix these 5k reported issues with the current claude-code instead of making hyperbolic claims about their new whizbang model.

https://github.com/anthropics/claude-code/issues

jFriedensreich2mo ago

The only reassuring thing is the Apache and Linux Foundation setups. Let's hope this is not just an appeasing mention but something more fundamental. If there really are models too dangerous to release to the public, companies like Oracle, Amazon and Microsoft would absolutely use this exclusive power not just to fix their holes but to damage their competitors.

lifeisstillgood2mo ago

Nicholas Carlini talks about it here on the Security, Cryptography, Whatever podcast - https://podcasts.apple.com/gb/podcast/security-cryptography-...

Sateeshm2mo ago

The bars have solid fill for Mythos and cross shaded for Opus 4.6. Makes the difference feel more than it actually is.

modeless2mo ago

I didn't see this at first, but the price is 5x Opus: "Claude Mythos Preview will be available to participants at $25/$125 per million input/output tokens", however "We do not plan to make Claude Mythos Preview generally available".

asdewqqwer2mo ago

There is a huge gap between the shining examples and the actual use case: what is the false positive rate? How do you judge a false positive?

If you need 1000 runs that cost 20000 USD to find a vulnerability, and you need 2000 USD to generate an exploit (which makes the finding self-verifying, i.e. not a false positive), then your cost is not 22000 USD but 1000 x 2000 + 20000, which is about 2 million USD: you have to try generating an exploit for every run before you know the finding is real, or you need to hire one (or several) senior security people to audit every single one of them.

A broken clock being correct twice a day is not impressive.
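The arithmetic in this comment can be sketched in a few lines. This is only an illustration under stated assumptions: the 1000 runs, the 20000 USD total scanning cost, and the 2000 USD per exploit attempt are the figures quoted above, while the function name and the assumption that every run yields exactly one candidate finding needing verification are hypothetical.

```python
def total_cost(scan_cost_total: float, exploit_cost_each: float,
               candidates_verified: int) -> float:
    """Scanning cost plus one exploit attempt per candidate finding."""
    return scan_cost_total + candidates_verified * exploit_cost_each


RUNS = 1000                # runs through the scaffold (figure from the thread)
SCAN_COST_TOTAL = 20_000   # USD for all runs together (figure from the thread)
EXPLOIT_COST_EACH = 2_000  # USD per exploit attempt (figure from the thread)

# Best case: you already know which single finding is real.
best_case = total_cost(SCAN_COST_TOTAL, EXPLOIT_COST_EACH, candidates_verified=1)

# Worst case (assumption): every run produces one candidate and
# each candidate must be verified by attempting an exploit.
worst_case = total_cost(SCAN_COST_TOTAL, EXPLOIT_COST_EACH, candidates_verified=RUNS)

print(best_case, worst_case)
```

The gap between the two numbers depends entirely on how many candidates have to be verified, which is why the false positive rate matters so much here.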

1 more reply

solid_fuel2mo ago

This is the same company that accidentally released the source for one of their flagship products last week and has been furiously DMCA-ing every repository that even mentions claude in the days since.

1 more reply

wslh2mo ago

I'm starting to wonder whether what Glasswing really shows is that parts of security have already gone underground: black-hat teams and state actors may already know about many more bugs than the public record suggests, while many security professionals and clients still treat the relatively small set of disclosed bugs as the state of the art.

kukkeliskuu2mo ago

I find it believable that this could potentially happen, although I am not sure the difference is so huge compared to existing models.

I used Opus 4.6 to find security vulnerabilities in a couple of my own projects; it found 33 vulnerabilities in one largish Django project.

The prompt wasn't even that impressive, just telling it to find vulnerabilities from certain files, and referring to OWASP. Then looping that.

NickNaraghi2mo ago

> Over the past few weeks, we have used Claude Mythos Preview to identify thousands of zero-day vulnerabilities (that is, flaws that were previously unknown to the software’s developers), many of them critical, in every major operating system and every major web browser, along with a range of other important pieces of software.

Sounds like we've entered a whole new era, never mind the recent cryptographic security concerns.

rossjudson2mo ago

Security by obscurity is over. The security vs usability balance is about to get a hard reset.

I think a number of black swan events are imminent, and it will substantially change the financial calculus that decides to put security behind revenue.

Any hole will be found, and any hole will be exploited. Plug as many holes as you can, and make lateral movement as painful as possible.

zb32mo ago

BTW it seems they forgot about the part that defense uses of the model also need to be safeguarded from people. Because what if a bad person from a bad country tries to defend against peaceful attacks from a good country like the US? That would be a tragedy, so we need to limit defensive capabilities too.

Rover2222mo ago

With Anthropic able to use this model internally (since February), is this the kickoff of ramping up the flywheel of recursive self improvement of AI? It seems like as long as there are still humans in the loop at most steps, exponential recursion isn’t possible.

punnerud2mo ago

Simon Willison (co-creator of Django) talked about this 5 days ago (19 min in): https://youtu.be/wc8FBhQtdsA?si=OeA5qzbWGqDY8Vu4

bdeol222mo ago

The uncomfortable bit isn't tooling—it's cadence. When the threat model shifts faster than your review loop can honestly re-run, you don't get security, you get paperwork that pretends nothing changed.

VadimPR2mo ago

I'm not one to believe the Silicon Valley hype usually (GPT-2 being too dangerous to release, AI giving us UBI, and so on), but having run Claude Opus 4.6 against my codebase (a MUD client) over the weekend, I can believe this assessment.

Opus alone did a good job of identifying security issues in my software, as it did with Firefox [1] and Linux [2]. A next-generation frontier model being able to find even more issues sounds believable.

That said, this is script kiddies vs sql injections all over again. Everyone will need to get their basic security up on the new level and it will become the new normal. And, given how intelligence agencies are sitting on a ton of zero-days already, this will actually help the general public by levelling out the playing field once again.

1 - https://www.anthropic.com/news/mozilla-firefox-security

2 - https://neuronad.com/ai-news/claude-code-unearthed-a-23-year...

cryptoegorophy2mo ago

Ironically, Claude CLI completely failed to detect rogue code in my HTML scan yesterday, while the ChatGPT web version detected it immediately. Can’t wait to do the same test with the newer version.

willamhou2mo ago

One thing I keep thinking about with AI security is that most of the focus is on model behavior — alignment, jailbreaks, guardrails. But once agents start calling tools, the attack surface shifts to the execution boundary. A request can be replayed, tampered with, or sent to the wrong target, and the server often has no way to distinguish that from a legitimate call.

Cryptographic attestation at the tool-call level (sign the request, verify before execution) would close a gap that behavioral controls alone can't cover. Curious whether Glasswing's threat model includes the agent-to-tool boundary or focuses primarily on the model layer.
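A minimal sketch of the "sign the request, verify before execution" idea, to make the three failure modes (tampering, replay, wrong target) concrete. Everything here is an assumption for illustration: the function names, the payload layout, the shared secret, and the 30-second freshness window are hypothetical, and a shared-secret HMAC stands in for whatever asymmetric signature or hardware attestation scheme a real deployment would use.

```python
import hashlib
import hmac
import json

SECRET = b"demo-shared-secret"  # hypothetical key shared by agent and tool server
MAX_AGE_SECONDS = 30            # hypothetical freshness window


def sign_call(tool, args, nonce, timestamp, key=SECRET):
    """Agent side: serialize the call deterministically and attach an HMAC tag."""
    payload = json.dumps(
        {"tool": tool, "args": args, "nonce": nonce, "ts": timestamp},
        sort_keys=True, separators=(",", ":"),
    ).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_call(signed, expected_tool, seen_nonces, now, key=SECRET):
    """Server side: reject tampered, misdirected, stale, or replayed calls."""
    expected = hmac.new(key, signed["payload"], hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signed["tag"]):
        return False  # payload or tag was modified in transit
    body = json.loads(signed["payload"])
    if body["tool"] != expected_tool:
        return False  # signed for a different target
    if abs(now - body["ts"]) > MAX_AGE_SECONDS:
        return False  # too old (or too far in the future)
    if body["nonce"] in seen_nonces:
        return False  # replay of an already executed call
    seen_nonces.add(body["nonce"])
    return True


seen = set()
call = sign_call("read_file", {"path": "README.md"}, nonce="n-1", timestamp=1000.0)
print(verify_call(call, "read_file", seen, now=1001.0))  # first delivery
print(verify_call(call, "read_file", seen, now=1001.0))  # same call replayed
```

Note that this only authenticates the request at the execution boundary; it says nothing about whether the agent should have issued the call in the first place, which remains a behavioral question.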

zambelli2mo ago

I'm glad to see that it stands its ground more than other models - which is a genuinely useful trait for an assistant. Both on technical and emotional topics.

DigitalArchivst2mo ago

Do folks recommend that family and friends ensure their systems are updated, and that they are using Bitwarden or 1Password? Or is that alarmist?

caycep2mo ago

When do we get our Kuang Grade Mark Eleven icebreaker?

tombelieber2mo ago

I think this new model will empower everyone in the world to have higher-quality, more secure software, not less.

attentive2mo ago

Is there a timeline mentioned anywhere for when any of this will be available to the unprivileged public: soon, not soon, never?

mlvvkviz2mo ago

they built a model so powerful they won't release it. but they couldn't secure claude code from a source code leak. the model is so advanced they're paying $100M to get big tech to adopt it. the launch video reads like verified amazon reviews. the gap between the narrative and the reality is the whole story here.

ahmaman2mo ago

Moving forward, wonder if such AI capabilities would widen the security gap between open-source software vs. proprietary?

zb32mo ago

Yeah, makes sense. Those countries are bad because they execute state-sponsored cyber attacks, the US and Israel on the other hand are good, they only execute state-sponsored defense.

User232mo ago

How much of Mythos’s internals will researchers be able to recover from the flood of patches?

rubises2mo ago

The harder problem isn't finding vulnerabilities — it's preventing AI from violating constraints in the first place. Prompt-level safety is probabilistic. Filesystem-level constraints (mkdir 禁/behavior) are deterministic. The AI can't violate a rule that's physically encoded as a folder path in its system prompt.

5d41402abc4b2mo ago

Are there any local models that I can set up to run on my code as part of CI?

wanderingmind2mo ago

So Mozilla is not part of this consortium, I'm guessing for deliberate reasons, to make Safari and Chrome the default browsers. I don't think Firefox can survive the upcoming attacks without robust support from foundational AI providers to secure the browser.

1 more reply

kmfrk2mo ago

Heck of a Patch Tuesday.

MisterBiggs2mo ago

What happens once an agent can reliably get 100% on swebench?

yalogin2mo ago

Has anyone played with the released versions of Claude and tried to create exploits? I cannot imagine it not being able to craft one if guided, unless the tooling around it doesn’t allow it

spprashant2mo ago

We finally have the answer to the question: when do these labs stop giving away intelligence to the general public for $20 a month?

Selling shovels is now worth less than taking all the gold for themselves.

waffletower2mo ago

My comment is a completely unsubstantiated conspiracy theory: the choice of model name, Mythos, seems out of character for Anthropic models, and one can easily wonder if the model truly exists as the name suggests. It could instead be a symbolic model used by colluding companies (and perhaps even governments) to establish a reference limit upon what models will be publicly accessible, period. Probably a terrible theory as it could spell doom for frontier model developing companies' business models -- setting the bar already would likely commodify LLMs via open source models quite quickly. But the name "Mythos" is such a strange choice for this model and the circumstances surrounding its release.

paoliniluis2mo ago

Does everyone agree that this makes Dario Amodei more powerful than any politician across the world? Anthropic is now the owner of the most powerful cyberweapon ever made.

cerved2mo ago

Anthropic should run it on their own code

maxmaio2mo ago

seems important and terrifying. This morning Opus 4.6 was blowing my mind in claude code... onward and upward

finchisko2mo ago

Wait, isn't it how Skynet started?

throwaway9112822mo ago

Pumping is taken to a new level... the model is so godlike that it can't be released as it is... this must be a joke.

nickandbro2mo ago

I want it

copypaper2mo ago

Yea, but can it secure systems from the unpatchable $5 wrench vulnerability?

https://xkcd.com/538/

Mecha_SalesCast2mo ago

we should notice that we've already reached the point where AI models are too dangerous to publicly release

cdelsolar2mo ago

let us have mythos damn it

0xbadcafebee2mo ago

tl;dr we find vulns so we can help big companies fix their security holes quickly (and so they can profit off it)

This is a kludge. We already know how to prevent vulnerabilities: analysis, testing, following standard guidelines and practices for safe software and infrastructure. But nobody does these things, because it's extra work, time and money, and they're lazy and cheap. So the solution they want is to keep building shitty software, but find the bugs in code after the fact, and that'll be good enough.

This will never be as good as a software building code. We must demand our representatives in government pass laws requiring software be architected, built, and run according to a basic set of industry standard best practices to prevent security and safety failures.

For those claiming this is too much to ask, I ask you: What will you say the next time all of Delta Airlines goes down because a security company didn't run their application one time with a config file before pushing it to prod? What will happen the next time your Social Security number is taken from yet another random company entrusted with vital personal information and woefully inadequate security architecture?

There's no defense for this behavior. Yet things like this are going to keep happening, because we let it. Without a legal means to require this basic safety testing with critical infrastructure, they will continue to fail. Without enforcement of good practice, it remains optional. We can't keep letting safety and security be optional. It's not in the physical world, it shouldn't be in the virtual world.

asG112mo ago

"We have also extended access to a group of over 40 additional organizations that build or maintain critical software infrastructure so they can use the model to scan and secure both first-party and open-source systems."

Yeah, yeah. Back in the day IBM Purify gave access to software organizations and found very little. Of course they did not have the free money of a marketing driven organization run by a weirdo (Amodei) that got rich by stealing and laundering IP.

This will fizzle out and the weirdo will have to pivot to their next marketing scheme.

Surac2mo ago

namedropping hell.

yusufozkan2mo ago

but people here had told me llms just predict the next word

jaspanglia2mo ago

What they will eventually do is deliberately take more control over what people want and what they work for. We don't trust such institutions after witnessing GATES thuggery all over.

cmiles82mo ago

So we’re meant to believe that Anthropic is sitting on a world ending cyber tool that writes God-like code while just forgetting that a week ago the same company leaked its source code on the internet and was ribbed for how shit it was.

Got it.

6thbit2mo ago

This is silly and disingenuous. In a matter of days or weeks a competing lab will make public a model with capabilities beyond this “mythos” one.

Is this a huge fear-driven marketing stunt to get governments and corporations into dealing with anthropic?

gnarlouse2mo ago

A cybersecurity pandemic will surely be the Hiroshima that wakes people up to AI. /s

Ms-J2mo ago

Anthropic and ClosedAI are some of the biggest bullshitters in the industry.

There is no moat, no special "capability", and when the time comes that we can run these models on our own, they will be cheap SaaS gimmicks marketed to corporates and used to make more slop pictures for social media.

ehutch792mo ago

Just include 'make it secure' in the prompt. Duh.

lasky2mo ago

The hype machine is alive and well in silicon valley.

cmiles82mo ago

I’m sure it’s a decent model. But it’s also clear folks are running out of runway and desperate to find something that sticks and keeps the party going.

All the promises of amazing things in general work never happened. Companies consistently say they’re seeing no ROI. The AI crowd now hard pivots to cyber and, right out of the Palantir playbook, runs with the “our stuff is so amazing we can’t talk about it, but trust us bro” move that isn’t really fooling anyone.

Meanwhile the folks let in on the “secret” are those that also desperately need for the hype to continue to protect their own positions in this game.

Look forward to a model upgrade but the hype fluff games are getting old. Watching OpenAI completely crash out of pole position on the hype train though has been at least amusing.

123malware3212mo ago

I don't know anyone reviewing these tools who is impressed and who also earns their paycheck doing bug bounties and finding actual CVEs.

Generally these things only find memory corruption stuff which is almost never the type of bug you're looking for, and it costs a lot which negates your bug bounty payout.

Each time they preach, ooh, 0day found, bla bla.

In this domain you need to be specific or you are just yelling clickbait into the wind.

What type of 0day, what did the exploit actually look like.

'complex 4 stage with heap spray' - that sounds really simple actually.... complex for memory corruption goes into multi-process, maybe things between kernel/usermode, or crazy 18-20 stage exploits people pop against things like MS Teams etc....

Even if there were some cool results from any of these projects, the amount of nonsense blurted out in articles around them really makes them seem like useless tools that are overmarketed by a bunch of excited children who don't really know what they are doing.

Get a dopamine hit, post on reddit, LOL. Hacking the planet (powered by Claude -_-)

tdaltonc2mo ago

> Mythos finds bug.

> NSA demands that bug stays in place and gags Anthropic.

> Anthropic releases Mythos.

Then what? Is a huge share of the US zero-day stockpiles about to be disarmed or proliferated?

dakolli2mo ago

If this is as dangerous as they make it out to be (it's not), why would their first impulse be to get every critical product, system, and corporation in the world to use it?

manbash2mo ago

This will likely not see the light of day. It's the usual PR that gathers many "partnerships".

Expect to see lots of these in the upcoming months as the big companies scramble to keep from losing money.

imranahmedjak2mo ago

Building a neighborhood data platform that scores every US ZIP code using Census, FBI, and EPA data. Also running a job aggregator that fetches 37K+ jobs daily from 17 sources. Both free, both Node.js + Express.

pizlonator2mo ago· 15 in thread

(And no, the Linux Foundation being in the list doesn't imply broad benefit to OSS. Linux Foundation has an agenda and will pick who benefits according to what is good for them.)

I think it would be net better for the public if they just made Mythos available to everyone.

hector_vasquez2mo ago

3 more replies

tokioyoyo2mo ago

SheinhardtWigCo2mo ago

> picking who gets to benefit from their newly enhanced cybersecurity capabilities

You could say this about coordinated disclosure of any widespread 0-day or new bug class, though

1 more reply

cedws2mo ago

2 more replies

lelanthran2mo ago

Or (and hear me out), they are close to an IPO and want to ensure that there is a world-ending threat around which they can cluster the biggest names, with themselves leading that group.

I think I just broke my cynicism meter :-(

1 more reply

baq2mo ago

It's messed up that the US Government simultaneously claims to be a public benefit and is also picking who gets to benefit from their newly enhanced nuclear capabilities.

-- someone in 1945, probably

1 more reply

SubiculumCode2mo ago

jstummbillig2mo ago

titzer2mo ago

In the long term, you're right, but in the short term, it's going to be a bloodbath.

1 more reply

hmokiguess2mo ago

While I agree with you, in some ways I'd argue that this is just them being transparent on what probably would inevitably already happen at the scale of these corporate overlords and modern monarchs.

There will always be a more capable technology in the hands of the few who hold the power, they're just sharing that with the world more openly.

oytis2mo ago

That's just in line with their ethics. They also maintain that countries other than the US should not have SOTA AI capabilities.

Flere-Imsaho2mo ago

If you're a maintainer, you can apply here:

https://claude.com/contact-sales/claude-for-oss

... As mentioned in the article.

1 more reply

malcolmgreaves2mo ago

stale20022mo ago

Better security is a good thing, not a bad thing, regardless of which companies are more difficult to hack. Hemming and hawing over a clear and obvious good is silly.

dragonelite2mo ago

Cue the "First time?" meme.

LiamPowell2mo ago· 12 in thread

    if (x != null) {
        y = *x; // Vulnerability! X could be null!
    }

Is it really so difficult for them to talk about what they've actually achieved without smearing a layer of nonsense over every single blog post?

Edit: See my reply below for why I think Claude is likely to have generated nonsensical bug reports here: https://news.ycombinator.com/item?id=47683336

QuiEgo2mo ago

I agree the wording is a bit alarmist, but a closer example to what they are saying is:

  bool silly_mistake = false;
  
  //... lots of lines of code

  free(x);

  //... lots of lines of code

  if (silly_mistake) { // silly_mistake shown to be false at this point in the program in all testing, so far
     free(x);
  }

A bug like above would still be something that would be patched, even if a way to exploit it has not yet been found, so I think it's fair to call out (perhaps with less sensationalism).

1 more reply

ralph842mo ago

Just because the plane can fly on one engine doesn't mean you don't fix the other engine when it fails.

2 more replies

sophiebits2mo ago

Presumably they mean they could make user code trigger a write out of bounds to kernel memory, but they couldn’t figure out how to escalate privileges in a “useful” way.

1 more reply

red75prime2mo ago

The kernel address space layout randomization they are talking about is a bit different from (x != null). Another bug may make it possible to locate the required address.

MatejKafka2mo ago

It could very well be an actual reachable buffer overflow, but with KASLR, canaries, CET and other security measures, it's hard to exploit it in a way that doesn't immediately crash the system.

bottlepalm2mo ago

We've very quickly reached the point where AI models are now too dangerous to publicly release, and HN users are still trying to trivialize the situation.

5 more replies

rootkea2mo ago

1 more reply

danielheath2mo ago

Is this code multithreaded? X could indeed be null, in that case.

slopinthebag2mo ago

userbinator2mo ago

deadliftdouche2mo ago

1 more reply

bri3d2mo ago

ofjcihen2mo ago· 9 in thread

I would honestly go so far as to say the overhype is detrimental to actual measured adoption.

qnleigh2mo ago

There is plenty of overhyping, no one denies that. But the antidote is not to dismiss everything. Ignore the words and look at the data.

5 more replies

jstummbillig2mo ago

> how every new iteration is going to spell doom/be a paradigm shift/change the entire tech industry etc.

16 more replies

nbardy2mo ago

There are step changes that actually merit this, though. And a zero-day machine IS one of those. It went from a 4% zero-day success rate to 85% on Firefox.

Can you not see the significance of that?

1 more reply

alexey-salmin2mo ago

I think Claude Code with Sonnet 4.6 is already at the level of paradigm shift and can change the entire tech industry.

If you're paranoid it doesn't mean you're not being followed. If something is overhyped it doesn't mean it's not game-changing.

1 more reply

nl2mo ago

Well Opus 4.5/4.6 kinda was right?

I mean software development has changed more since then than it has in my 30 year software development career.

jwpapi2mo ago

I agree, I can’t open any social media anymore.

corranh2mo ago

It’s great marketing to lead with how the n+1 model is so amazing that you can’t have it yet.

1 more reply

jillesvangurp2mo ago

> I would honestly go so far as to say the overhype is detrimental to actual measured adoption.

2 more replies

AlexCoventry2mo ago

Do you think they're lying about the vulnerabilities they claim Mythos has found? Seems like a very short-term play, if so.

redfloatplane2mo ago· 9 in thread

The system card for Claude Mythos (PDF): https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...

Interesting to see that they will not be releasing Mythos generally. [edit: Mythos Preview generally - fair to say they may release a similar model but not this exact one]

I'm still reading the system card but here's a little highlight:

and interestingly:

> To be explicit, the decision not to make this model generally available does _not_ stem from Responsible Scaling Policy requirements.

The threat model in question:

slacktivism1232mo ago

https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...

"5.10 External assessment from a clinical psychiatrist" is a new section in this system card. Why are Anthropic like this?

6 more replies

yieldcrv2mo ago

they also don't have the compute, which seems more relevant than its large increase in capabilities

I bet it's also misaligned like GPT 4.1 was

given how these models are created, Mythos was probably cooking ever since then, and doesn't have the learnings or alignment tweaks that models which were released in the last several months have

ainch2mo ago

2 more replies

_pdp_2mo ago

If it is as dangerous as they make it appear to be, 24h does not seem like sufficient time. I cannot accept this as a serious attempt.

4 more replies

enraged_camel2mo ago

>> Interesting to see that they will not be releasing Mythos generally.

I don't think this is accurate. The document says they don't plan to release the Preview generally.

1 more reply

throwaw122mo ago

are we cooked yet?

Benchmarks look very impressive! even if they're flawed, it still translates to real world improvements

3 more replies

stevenhuang2mo ago

Oh I enjoyed the Sign Painter short story it wrote.

---

Teodor painted signs for forty years in the same shop on Vell Street, and for thirty-nine of them he was angry about it.

"He won't take it," Teodor said.

"It's better," she said.

"It is better," he said. "He won't take it."

He took her to the shelf. She looked at the signs a long time.

"These are beautiful," she said.

"Yes."

"Why are they here?"

He had thought about this for thirty-nine years and had many answers and all of them were about the customers and none of them had ever made him less angry. So he tried a different one.

"Then what's the skill for?"

5 more replies

torginus2mo ago

Just reading this, the inevitable scaremongering about biological weapons comes up.

Since most of us here are devs, we understand that software engineering capabilities can be used for good or bad - mostly good, in practice.

I think this should not be different for biology.

I would like to reach out and talk to biologists - do you find these models to be useful and capable? Can it save you time the way a highly capable colleague would?

Do you think these models will lead to similar discoveries and improvements as they did in math and CS?

9 more replies

cyanydeez2mo ago

[flagged]

1 more reply

9cb14c1ec02mo ago· 8 in thread

Now, it's very possible that this is Anthropic marketing puffery, but even if it is half true it still represents an incredible advancement in hunting vulnerabilities.

It could also totally reshape military sigint in similar ways.

Who knows, maybe the sealing off of memory vulns for good will inspire whole new classes of vulnerabilities that we currently don't know anything about.

woeirua2mo ago

You should watch this talk by Nicholas Carlini (security researcher at Anthropic). Everything in the talk was done with Opus 4.6: https://www.youtube.com/watch?v=1sd26pWhfmg

4 more replies

georgemcbay2mo ago

It will likely cause some interesting tensions with government as well.

eg. Apple's official stance per their 2016 customer letter is no backdoors:

https://www.apple.com/customer-letter/

https://en.wikipedia.org/wiki/Apple%E2%80%93FBI_encryption_d...

What happens when that is no longer true, especially in today's political climate?

4 more replies

wanderingmind2mo ago

1 more reply

Gigachad2mo ago

4 more replies

cperciva2mo ago

> it's very possible that this is Anthropic marketing puffery

It isn't.

1 more reply

tex02mo ago

The interesting selling point about this, if the claims are substantial, is that nobody will be able to produce secure software without access to one of these models. Good for them $$$ ^^

2 more replies

slashdave2mo ago

elnerd2mo ago

Yesterday, I took a web application, downloaded the trial and asked AI to be a security researcher and find me high and critical severity bugs.

Even vanilla models spew out PoCs for three RCEs in less than an hour.

1 more reply

jryio2mo ago· 7 in thread

Let's fast forward the clock. Does software security converge on a world with fewer vulnerabilities or more? I'm not sure it converges equally in all places.

My understanding is that the pre-AI distribution of software quality (and vulnerabilities) will be massively exaggerated. More small vulnerable projects and fewer large vulnerable ones.

mlinsey2mo ago

I'm pretty optimistic that not only does this clean up a lot of vulns in old code, but applying this level of scrutiny becomes a mandatory part of the vibecoding-toolchain.

The biggest issue is legacy systems that are difficult to patch in practice.

4 more replies

timschmidt2mo ago

Most vulnerabilities seem to be in C/C++ code, or web things like XSS, unsanitized input, leaky APIs, etc.

Perhaps a chunk of that token spend will be porting legacy codebases to memory safe languages. And fewer tokens will be required to maintain the improved security.

1 more reply

lilytweed2mo ago

I think we’re starting to glimpse the world in which those individuals or organizations who pigheadedly want to avoid using AI at all costs will see their vulnerabilities brutally exploited.

2 more replies

cyanydeez2mo ago

I think this entire post is just an advertisement to goad CISOs to buy $package$ to try out.

socketcluster2mo ago

I suspect it will converge on minimal complexity software. Current software is way too bloated. Unnecessary complexity creates vulnerabilities and makes them harder to patch.

1 more reply

pants22mo ago

3 more replies

tdaltonc2mo ago

Depends - do you think people are good at keeping their fridge firmware up-to-date?

2 more replies

josephg2mo ago· 6 in thread

underdeserver2mo ago

I would suggest watching Nicholas Carlini's talk and Heather Adkins and Four Flynn's talks from unprompted:

https://youtu.be/1sd26pWhfmg?si=onOai_ocxkZeNWP0

https://youtu.be/B_7RpP90rUk?si=HkRBhw95DbbKX9lL

My takeaway is that fuzzing is not just complementary, it also gives a stronger AI a starting point. But AI is generally faster and better.

1 more reply

nextos2mo ago

ComplexSystems2mo ago

This line of reasoning makes no sense when the AI can just be given access to a fuzzer. I would guess that it probably did have access to a fuzzer to put together some of these vulnerabilities.

acdha2mo ago

kristofferR2mo ago

AI can initiate the fuzzing and optimize the process of fuzzing.
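For anyone unsure what "the fuzzing" amounts to at its simplest, here is a minimal sketch of a mutation fuzzer. Everything in it is hypothetical: `parse_record` is a toy target I made up with a deliberately planted length-check bug, and the one-byte mutation strategy is the crudest one possible. Real fuzzers are coverage-guided and far smarter; this only shows the loop a model could drive or tune.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Toy target (hypothetical): 1-byte length prefix followed by a payload.

    Deliberately buggy: it trusts the length byte. Where C code would read
    out of bounds, this model of the bug raises IndexError instead.
    """
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    if len(payload) != length:
        raise IndexError("read past end of buffer")
    return payload


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte of the seed with a random value."""
    buf = bytearray(seed)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed: bytes, iterations: int, rng: random.Random) -> list[bytes]:
    """Return the mutated inputs that trigger the simulated crash."""
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            parse_record(candidate)
        except IndexError:
            crashes.append(candidate)  # the planted bug was reached
        except ValueError:
            pass  # clean rejection, not a crash
    return crashes


if __name__ == "__main__":
    found = fuzz(b"\x03abc", 1000, random.Random(0))
    print(f"{len(found)} crashing inputs, e.g. {found[:1]}")
```

The part a model would plausibly "optimize" is the choice of seed and of `mutate`: picking inputs and mutations that reach interesting code faster than random byte flips do.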

tptacek2mo ago

3 more replies

rakel_rakel2mo ago· 6 in thread

AITA for thinking that PRISM was probably the state sponsored program affecting civilian life the most? And that one state is missing from the list here?

ronsor2mo ago

> Large American AI company does not list the US as an adversarial actor

This is not a surprise or a gotcha.

1 more reply

laweijfmvo2mo ago

I can think of two I’d add to the list. One was recently publicly denied access to Anthropic's models and the other was busy exploding pagers.

1 more reply

JumpCrisscross2mo ago

> PRISM was probably the state sponsored program affecting civilian life the most?

The fact that Iran hasn't been able to do diddly squat in America should sink in the fact that they didn't compromise us. (EDIT: blep. I was wrong.)

2 more replies

parthdesai2mo ago

The irony of that statement given the current circumstances

lobochrome2mo ago

How did PRISM affect civilian life?

1 more reply

neonstatic2mo ago

Look, we have always been at war with EastAsia.

atlgator2mo ago· 5 in thread

[flagged]

j2kun2mo ago

Which bug?

[edit]: this bug: https://ftp.openbsd.org/pub/OpenBSD/patches/7.8/common/025_s...

IsTom2mo ago

FFmpeg has a lot of weird and not widely used codecs that don't get a lot of scrutiny. If there are no specifics then it could be a bug in one of them.

2 more replies

l5agh2mo ago

This was the top comment and it is suddenly flagged for no reason at all. It looks like meta-flagging, where people just want to hide replies to the comment they do not want you to read.

The amount of astroturfing and astroflagging in Anthropic threads is insane.

rlopc2mo ago

These issues are always found in the same kinds of projects that support an insane amount of largely unused protocols and features like ffmpeg, sudo, curl.

OpenBSD has many unexplored corners and also (irresponsibly IMO) maintains forks of other projects in base.

A motivated human could find all of these probably by writing 100% code coverage and fuzzing.

The market for these tools is very small. Good luck applying them to a release of sqlite or postfix.

kranke1552mo ago

3 more replies

impulser_2mo ago· 5 in thread

So they are only giving access to their smartest model to corporations.

You think these AI companies are really going to give AGI access to everyone. Think again.

We better fucking hope open source wins, because we aren't getting access if it doesn't.

open5922mo ago

Then the next lab catches up and releases it more broadly

Then later the open weights model is released.

The only way this type of technology is going to be gated "to only corporations" is if we continue on this exponential scaling trend as the "SOTA" model is always out of reach.

1 more reply

dreis_sw2mo ago

It also took many years to put capable computers in the hands of the general public, but it eventually happened. I believe the same will happen here, we're just in the Mainframe era of AI.

1 more reply

justincormack2mo ago

And the Linux Foundation.

dievskiy2mo ago

Would you hope that it would be released today so that evil actors could invest a few million to search for 0-days across popular open-source repos?

throwaw122mo ago

of course they're not giving access to everyone.

they better make billions directly from corporations, instead of giving them to average people who might get a chance out of poverty (but also bad actors using it to do even more bad things)

1 more reply

steinwinde2mo ago· 3 in thread

Not a single word of caution regarding possible abuse. Instead apparent support for its "offensive" capabilities.

khafra2mo ago

> what does Anthropic do to prevent malicious use of its software by its own government?

Anthropic has ameliorated that danger by being designated a supply-chain risk by the DoW, preventing the USG from using it.

alexey-salmin2mo ago

saretup2mo ago

Even more 'disquieting' when you take into account who's currently the president of US.

"A whole civilization will die tonight, never to be brought back again. I don’t want that to happen, but it probably will." - Donald Trump

3 more replies

cbg02mo ago· 3 in thread

One of the things I'm always looking at with new models released is long context performance, and based on the system card it seems like they've cracked it:

  GraphWalks BFS 256K-1M
  Mythos    Opus     GPT5.4
  80.0%     38.7%    21.4%

metadat2mo ago

Data source:

https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...

(Search for “graphwalk”.)

If true, the SWE bench performance looks like a major upgrade.

radicality2mo ago

1 more reply

himata41132mo ago

this seems to be similar to gpt-pro: they just have a very large attention window (which is why it's so expensive to run). The true attention window of most models is 8096 tokens.

2 more replies

temp1237892462mo ago· 3 in thread

OpenAI initially claimed that GPT-2 was too dangerous to release in 2019.

How many times will labs repeat the same absurd propaganda?

uselessTA2mo ago

The claim I remember was that releasing it would start an arms race for AGI, which I think it clearly did

SubiculumCode2mo ago

Anthropic and OpenAI have very different cultures and ethos. Point to other times where Anthropic has gone the way of cheap marketing tricks. Now look at OpenAI. Not even close.

1 more reply

bitwize2mo ago

OpenAI did not make the strong specific claims about GPT2's abilities that Anthropic is making about Claude Mythos.

zachperkel2mo ago· 3 in thread

Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser.

Scary but also cool

ex-aws-dude2mo ago

Did someone actually go through all of those and check if they are high-severity or did the AI just tell them that?

1 more reply

fsflover2mo ago

1 more reply

dakolli2mo ago

Or more likely, it's just an exaggeration or lie.

2 more replies

meander_water2mo ago· 3 in thread

I think this is a largely inflated PR stunt.

Opus 4.6 was already capable of finding 0days and chaining together vulns to create exploits. See [0] and [1].

[0] https://www.csoonline.com/article/4153288/vim-and-gnu-emacs-...

[1] https://xbow.com/blog/top-1-how-xbow-did-it

solenoid09372mo ago

Absolutely not a PR stunt, talk to one of your friends working at partner companies with access to the model

ofjcihen2mo ago

pertymcpert2mo ago

Did you read the article?

dakolli2mo ago· 3 in thread

If AGI is going to be a thing, it's only going to be a thing for Fortune 100 companies.

However, my guess is this is mostly the typical scare tactic marketing that Dario loves to push about the dangers of AI.

supern0va2mo ago

>However, my guess is this is mostly the typical scare tactic marketing that Dario loves to push about the dangers of AI.

Evaluate it yourself. Look at the exploits it discovered and decide whether you want to feel concerned that a new model was able to do that. The data is right there.

1 more reply

rvz2mo ago

Well, Yes.

This is marketing wrapped around a biased research paper.

dist-epoch2mo ago

The plan of Elon Musk for Macrohard is to replace all software companies with it, when they get AGI.

1 more reply

Miraste2mo ago· 3 in thread

>We plan to launch new safeguards with an upcoming Claude Opus model, allowing us to improve and refine them with a model that does not pose the same level of risk as Mythos Preview.

This seems like the real news. Are they saying they're going to release an intentionally degraded model as the next Opus? Big opportunity for the other labs, if that's true.

SheinhardtWigCo2mo ago

The other labs already censor their models. Everyone is trying to find the sweet spot where performance and ‘alignment’ are both maximized. This seems no different

wslh2mo ago

> Big opportunity for the other labs, if that's true.

zb32mo ago

Well since Anthropic treats us as second class evil citizens, I guess they don't want our evil money either.

SheinhardtWigCo2mo ago· 3 in thread

Society is about to pay a steep price for the software industry's cavalier attitude toward memory safety and control flow integrity.

titzer2mo ago

doug_durham2mo ago

2 more replies

torginus2mo ago

Thank god, finally someone said it.

I don't know the first thing about cybersecurity, but in my experience all these sandbox-break RCEs involve a step of hijacking the control flow.

2 more replies

endunless2mo ago· 3 in thread

Another Anthropic PR release based on Anthropic’s own research, uncorroborated by any outside source, where the underlying, unquestioned fact is that their model can do something incredible.

> AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities

I like Anthropic, but these are becoming increasingly transparent attempts to inflate the perceived capability of their products.

NitpickLawyer2mo ago

Analemma_2mo ago

1 more reply

conradkay2mo ago

I would've basically agreed with you until I'd seen this talk: https://www.youtube.com/watch?v=1sd26pWhfmg

Maybe a bad example since Nicholas works at Anthropic, but they're very accomplished and I doubt they're being misleading or even overly grandiose here

See the slide 13 minutes in, which makes it look to be quite a sudden change

2 more replies

sam0x172mo ago· 2 in thread

It's all just really genius marketing. In 6 months Mythos will be nothing special, but right now everyone is being manipulated into fearing its release, as a marketing ploy.

This is the same reason AI founders perennially worry in public that they have created AGI...

declan_roberts2mo ago

I'm sure all they've done here is spend unlimited tokens to find bugs in mostly open source projects (and fuzz some closed source ones).

sam0x172mo ago

It's effectively 2026's version of "Doctors hate this one weird trick!"

1 more reply

josh-sematic2mo ago· 2 in thread

Must be nice to be in a position to sell both disease and cure.

tptacek2mo ago

That's exactly not what they're doing. They aren't creating operating system vulnerabilities. They're telling you about ones that already existed.

3 more replies

supern0va2mo ago

Yeah, I'd be pretty pissed at my doctor for finding cancerous cells that probably wouldn't have been a problem for quite some time, either. Ignorance is bliss, security through obscurity, whatever.

2 more replies

Ryan5453OP2mo ago· 2 in thread

Pricing for Mythos Preview is $25/$125, so cheaper than GPT 4.5 ($75/$150) and GPT 5.4 Pro ($30/$180)

conradkay2mo ago

For comparison, 5x the cost of Opus 4.6, and 1.67x for Opus 4.1

I think this would be very heavily used if they released it, completely unlike GPT 4.5
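The ratios above are easy to sanity-check. A minimal sketch, assuming the $25/$125 figures are USD per million input/output tokens as quoted in the parent, and assuming $5/$25 for Opus 4.6 and $15/$75 for Opus 4.1. Those two Opus prices are my assumption, picked because they reproduce the 5x and 1.67x figures; the model keys are just labels, not API identifiers.

```python
# Prices in USD per million tokens as (input, output).
# Mythos Preview figures come from the parent comment; the Opus figures
# are assumptions that reproduce the ratios quoted above.
PRICES = {
    "mythos-preview": (25.0, 125.0),
    "opus-4.6": (5.0, 25.0),   # assumed
    "opus-4.1": (15.0, 75.0),  # assumed
}


def price_ratio(model: str, baseline: str) -> tuple[float, float]:
    """Return (input_ratio, output_ratio) of model relative to baseline."""
    m_in, m_out = PRICES[model]
    b_in, b_out = PRICES[baseline]
    return (m_in / b_in, m_out / b_out)


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one job with the given token counts."""
    p_in, p_out = PRICES[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1_000_000


if __name__ == "__main__":
    for base in ("opus-4.6", "opus-4.1"):
        r_in, r_out = price_ratio("mythos-preview", base)
        print(f"vs {base}: {r_in:.2f}x input, {r_out:.2f}x output")
```

Under those assumptions a job with 1M input tokens and 100K output tokens would come to $37.50 on Mythos Preview.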

1 more reply

cassianoleal2mo ago

Where did you get that from?

From TFA:

> We do not plan to make Claude Mythos Preview generally available

1 more reply

taupi2mo ago· 2 in thread

Part of me wonders if they're not releasing it for safety reasons, but just because it's too expensive to serve. Why not both?

wyre2mo ago

I don't think they have the infra to support the demand. Anthropic can't keep up with the demand from OpenClaw users, they won't be able to keep up with public demand for something like Mythos.

coffeebeqn2mo ago

If these numbers are correct it’s probably worth the extra price

dang2mo ago· 2 in thread

Related ongoing threads:

System Card: Claude Mythos Preview [pdf] - https://news.ycombinator.com/item?id=47679258

Assessing Claude Mythos Preview's cybersecurity capabilities - https://news.ycombinator.com/item?id=47679155

I can't tell which of the 3 current threads should be merged - they all seem significant. Anyone?

aurareturn2mo ago

Let them all live. This is going to blow up one thread if you merge them.

HPMOR2mo ago

I think merging them into either this thread, or the System Card makes the most sense to me.

Sol-2mo ago· 2 in thread

rishabhaiover2mo ago

1 more reply

wilson0902mo ago

Inference is where they make the money they spend on training, so this feels unlikely. Perhaps this is not true for Mythos, though.

anVlad112mo ago· 2 in thread

unethical_ban2mo ago

I'm sympathetic to your point, but I'm sure there are heightened trust levels between the participating orgs and confidentiality agreements out the wazoo.

What has it stopped you from doing?

1 more reply

SheinhardtWigCo2mo ago

Yes, and that's normal. Coordinated disclosure is standard practice when the risk of public disclosure is unacceptable.

1 more reply

LoganDark2mo ago· 2 in thread

It's nice to know that they continue to be committed to advertising how safe and ethical they are.

raldi2mo ago

In what ways is Anthropic different from a hypothetical frontier lab that you would characterize as legitimately safe and ethical?

2 more replies

rvz2mo ago

They are not our friends and are the exact opposite of what they are preaching to be.

Let alone their CEO scare mongering and actively attempting to get the government to ban local AI models running on your machine.

2 more replies

4qt232mo ago· 2 in thread

Software has been doing fine without Misanthropic. These automated tools find very little. They selected the partners because they, too, want to keep up the illusion that AI works.

Whenever a company pivots to "cyber" rhetoric, it is a clear indication that they are selling snake oil.

Secure your girl school target selectors first.

borski2mo ago

This is a comment from someone that has never used these tools for vulnerability research. That much is very clear.

1 more reply

emceestork2mo ago

Account created 6 minutes ago...

1 more reply

ssgodderidge2mo ago· 1 in thread

At the very bottom of the article, they posted the system card of their Mythos preview model [1].

In section 7.6 of the system card, it discusses open-ended self-interactions. They describe running 200 conversations where the model talks to itself for 30 turns.

I wonder if this tendency toward uncertainty, toward questioning, makes it uniquely equipped to detect vulnerabilities where other models such as Opus couldn't.

[1] https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...

dakolli2mo ago

Typical Dario marketing BS to get everyone thinking Anthropic is on the verge of AGI and massaging the narrative that regular people can't be trusted with it.

2 more replies

agrishin2mo ago· 1 in thread

How long would it take to turn a defensive mechanism into an offensive one?

SheinhardtWigCo2mo ago

In this case there is almost no distinction. Assuming the model is as powerful as claimed, someone with access to the weights could do immense damage without additional significant R&D.

2 more replies

gck12mo ago· 1 in thread

I chuckle every time <insert any LLM company here> says something along the lines of "the model is so good that we won't release it to general public, ekhm, because safety".

Because the exact same thing has been said on every single upcoming model since GPT 3.5.

At this point, this must be an inside joke to do this just because.

rvz2mo ago

This is how Anthropic is marketing their AI releases and the reality is, they are terrified of local AI models competing against them.

So it is just marketing wrapped around creating fear to get local AI models banned.

2 more replies

stephc_int132mo ago· 1 in thread

I think this is bad news for hackers, spyware companies and malware in general.

We all knew vulnerabilities exist, many are known and kept secret to be used at an appropriate time.

There is a whole market for them, but more importantly large teams in North Korea, Russia, China, Israel and everyone else who are jealously harvesting them.

I believe automation can help here too, and we may end up with a considerably stronger and more reliable software stack.

tptacek2mo ago

1 more reply

bredren2mo ago· 1 in thread

Can anyone point at the critical vulnerabilities already patched as a result of mythos? (see 3:52 in the video)

For example, the 27 year old openbsd remote crash bug, or the Linux privilege escalation bugs?

I know we've had some long-standing high profile, LLM-found bugs discussed but seems unlikely there was speculation they were found by a previously unannounced frontier model.

[0] https://www.youtube.com/watch?v=INGOC6-LLv0

ollin2mo ago

- The OpenBSD one is 'TCP packets with invalid SACK options could crash the kernel' https://cdn.openbsd.org/pub/OpenBSD/patches/7.8/common/025_s...

- One (patched) Linux kernel bug is 'UaF when sys_futex_requeue() is used with different flags' https://github.com/torvalds/linux/commit/e2f78c7ec1655fedd94...

1 more reply

simonw2mo ago· 1 in thread

From Willy Tarreau, lead developer of HAProxy: https://lwn.net/Articles/1065620/

> And we're now seeing on a daily basis something that never happened before: duplicate reports, or the same bug found by two different people using (possibly slightly) different tools.

From Daniel Stenberg of curl: https://mastodon.social/@bagder/116336957584445742

> The challenge with AI in open source security has transitioned from an AI slop tsunami into more of a ... plain security report tsunami. Less slop but lots of reports. Many of them really good.

> I'm spending hours per day on this now. It's intense.

From Greg Kroah-Hartman, Linux kernel maintainer: https://www.theregister.com/2026/03/26/greg_kroahhartman_ai_...

> Months ago, we were getting what we called 'AI slop,' AI-generated security reports that were obviously wrong or low quality. It was kind of funny. It didn't really worry us.

> Something happened a month ago, and the world switched. Now we have real reports. All open source projects have real reports that are made with AI, but they're good, and they're real.

Shared some more notes on my blog here: https://simonwillison.net/2026/Apr/7/project-glasswing/

ofjcihen2mo ago

Could this potentially be because more researchers are becoming accustomed to the tools/adding them to their pipelines?

underdeserver2mo ago· 1 in thread

Interesting also is what they didn't find, e.g. a Linux network stack remote code execution vulnerability. I wonder if Mythos is good enough that there really isn't one.

NickJLange2mo ago

Linux had its SACK moment in 2019 - https://access.redhat.com/security/vulnerabilities/tcpsack#s...

We could just be seeing the fruit of expensive SWE RL on existing source material.

kristofferR2mo ago· 1 in thread

charcircuit2mo ago

These claims of how much harm the models will cause are always overblown.

2 more replies

jiusanzhou2mo ago· 1 in thread

jusling2mo ago

so it looks like ai-slop replies have made their way to HN...

1 more reply

baddash2mo ago· 1 in thread

> security product

> glass in the name

pugworthy2mo ago

I had a team mate propose a new security layer for an industrial device which he wanted to call "Eggshell"

1 more reply

oyebenny2mo ago· 1 in thread

why do I feel like the auditing industry is about to evaporate? thanks to this.

KeplerBoy2mo ago

I guess the more likely option is the auditing industry will pay huge sums to get access to those models as vetted operators.

Fokamul2mo ago· 1 in thread

+ NSA, CIA

nikcub2mo ago

Department of War timing on picking fights couldn't be worse

SirYandi2mo ago· 1 in thread

solenoid09372mo ago

What interest does Apple have in boosting Mythos?

anuramat2mo ago· 1 in thread

"oops, our latest unreleased model is so good at hacking, we're afraid of it! literal skynet! more literal than the last time!"

almost like they have an incentive to exaggerate

knowaveragejoe2mo ago

I'm sure they do, yet the models really are getting scarily good at this. This talk changed my view on where we're actually at:

https://www.youtube.com/watch?v=1sd26pWhfmg

throwaway133372mo ago· 1 in thread

I really wanted to like anthropic. They seem the most moral, for real.

But at the core of anthropic seems to be the idea that they must protect humans from themselves.

They advocate government regulations of private open model use. They want to centralize the holding of this power and ban those that aren't in the club from use.

They, like most tech companies, seem to lack the idea that individual self-determination is important. Maybe the most important thing.

dralley2mo ago

That is unequivocally true with some things. You don't want people exercising their "self-determination" to own private nukes.

1 more reply

burntcaramel2mo ago

Who gates access to the circle? Anthropic or existing circle members or some other governance? If you are outside the circle will you be certain to die from software diseases?

I think we have found the moat for AI. The question is are you inside or outside the castle walls?

3 more replies

ilaksh2mo ago

1 more reply

chenzhekl2mo ago

6 more replies

picafrost2mo ago

As Iran engages in a cyber attack campaign [1] today the timing of this release seems poignant. A direct challenge to their supply chain risk designation.

[1] https://www.cisa.gov/news-events/cybersecurity-advisories/aa...

skerit2mo ago

I'm sure it'll be better than Opus 4.6, but so much of this seems hype. Escaping its sandbox, having to do "brain scans" because it's "hiding its true intent", bla bla bla.

If it manages to work on my java project for an entire day without me having to say "fix FQN" 5 times a day I'll be surprised.

navilai2mo ago

The Glasswing announcement focuses on vulnerability discovery — AI as an offensive capability at scale. That part is getting lots of attention.

That's not a capability concern. That's a runtime security problem.

eranation2mo ago

Few thoughts

CVE-2026-35020 [3] CVE-2026-35021 [4] CVE-2026-35022 [5]

Not making any opinion, just thought it's worth sharing, for some perspective.

[0] https://red.anthropic.com/2026/mythos-preview/

[1] https://www.openbsd.org/errata78.html (look for 025)

[2] https://www.cve.org/Resources/General/Towards-a-Common-Enume...

[3] https://www.cve.org/CVERecord?id=CVE-2026-35020

[4] https://www.cve.org/CVERecord?id=CVE-2026-35021

[5] https://www.cve.org/CVERecord?id=CVE-2026-35022

Edit: if it was not obvious, these CVEs on Claude Code were found by an independent security researcher (Phoenix security) and not by Anthropic / Mythos.

1 more reply

aurizon2mo ago

sensanaty2mo ago

[1] https://status.claude.com/

1 more reply

Apylon7772mo ago

Maybe Anthropic could fix these 5k reported issues with the current claude-code instead of making hyperbolic claims about their new whizbang model.

https://github.com/anthropics/claude-code/issues

jFriedensreich2mo ago

lifeisstillgood2mo ago

Nicholas Carlini talks about it here on the Security, Cryptography, Whatever podcast - https://podcasts.apple.com/gb/podcast/security-cryptography-...

Sateeshm2mo ago

The bars have solid fill for Mythos and cross shaded for Opus 4.6. Makes the difference feel more than it actually is.

modeless2mo ago

asdewqqwer2mo ago

There is a huge gap between the shining examples and actual use case: What is the false positive rate? How to judge false positive?

A broken clock being correct twice a day is not impressive.
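To make the question concrete: for bug reports, the number people usually want is the share of reports that turn out to be wrong once a human triages them. Strictly that is the false discovery rate, FP / (TP + FP), since a true "false positive rate" would need a count of true negatives, which nobody has here. A minimal sketch, with a hypothetical `TriageResult` type and made-up counts:

```python
from dataclasses import dataclass


@dataclass
class TriageResult:
    confirmed: int  # reports a human reproduced as real bugs (TP)
    rejected: int   # reports a human judged to be bogus (FP)

    @property
    def precision(self) -> float:
        """TP / (TP + FP): share of reports that were real."""
        total = self.confirmed + self.rejected
        if total == 0:
            raise ValueError("no triaged reports")
        return self.confirmed / total

    @property
    def false_discovery_rate(self) -> float:
        """FP / (TP + FP): share of reports that wasted triage time."""
        return 1.0 - self.precision


if __name__ == "__main__":
    # Hypothetical numbers, not taken from any published result.
    batch = TriageResult(confirmed=30, rejected=10)
    print(f"precision={batch.precision:.2f}, FDR={batch.false_discovery_rate:.2f}")
```

"How to judge" is the harder half: the counts only mean something if `confirmed` requires an actual reproduction rather than the model's own say-so.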

1 more reply

solid_fuel2mo ago

1 more reply

wslh2mo ago

kukkeliskuu2mo ago

I find it believable that this could potentially happen, although I am not sure the difference is so huge compared to existing models.

I used Opus 4.6 to find security vulnerabilities in a couple of my own projects; it found 33 vulnerabilities in one largeish Django project.

The prompt wasn't even that impressive, just telling it to find vulnerabilities from certain files, and referring to OWASP. Then looping that.
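Roughly, the loop looks like the sketch below. This is an illustration rather than the exact setup: `ask_model` is a hypothetical callable (prompt in, text out) standing in for whatever client you use, and the prompt wording and the `review_sources` helper are made up for the example.

```python
from typing import Callable

PROMPT = (
    "You are reviewing {name} for security vulnerabilities. "
    "Use the OWASP Top 10 as a checklist. "
    "List each finding on its own line, or reply NONE.\n\n{code}"
)


def review_sources(
    sources: dict[str, str],          # file name -> file contents
    ask_model: Callable[[str], str],  # hypothetical: prompt in, text out
    passes: int = 3,
) -> dict[str, set[str]]:
    """Ask about each file several times and pool the distinct findings."""
    findings: dict[str, set[str]] = {}
    for name, code in sources.items():
        prompt = PROMPT.format(name=name, code=code)
        pooled: set[str] = set()
        for _ in range(passes):
            reply = ask_model(prompt)
            for line in reply.splitlines():
                line = line.strip()
                if line and line.upper() != "NONE":
                    pooled.add(line)
        findings[name] = pooled
    return findings
```

Repeating the same prompt matters because replies vary between runs; pooling into a set only removes exact duplicate lines, so reworded duplicates still need a human pass.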

NickNaraghi2mo ago

Sounds like we've entered a whole new era, never mind the recent cryptographic security concerns.

rossjudson2mo ago

Security by obscurity is over. The security vs usability balance is about to get a hard reset.

I think a number of black swan events are imminent, and it will substantially change the financial calculus that decides to put security behind revenue.

Any hole will be found, and any hole will be exploited. Plug as many holes as you can, and make lateral movement as painful as possible.

zb32mo ago

Rover2222mo ago

punnerud2mo ago

Simon Willison (co-creator of Django) talked about this 5 days ago (19 min in): https://youtu.be/wc8FBhQtdsA?si=OeA5qzbWGqDY8Vu4

bdeol222mo ago

VadimPR2mo ago

1 - https://www.anthropic.com/news/mozilla-firefox-security

2 - https://neuronad.com/ai-news/claude-code-unearthed-a-23-year...

cryptoegorophy2mo ago

Ironically, Claude CLI completely failed to detect rogue code in my HTML scan yesterday, while the ChatGPT web version detected it immediately. Can’t wait to do the same test with the newer version.

willamhou2mo ago

zambelli2mo ago

I'm glad to see that it stands its ground more than other models - which is a genuinely useful trait for an assistant. Both on technical and emotional topics.

DigitalArchivst2mo ago

Do folks recommend that family and friends ensure their systems are updated, and that they are using Bitwarden or 1Password? Or is that alarmist?

caycep2mo ago

When do we get our Kuang Grade Mark Eleven icebreaker?

tombelieber2mo ago

I think this new model will empower everyone in the world to have higher-quality, more secure software, not less.

attentive2mo ago

Is there a timeline mentioned anywhere for when any of this will be available to the unprivileged public: soon, not soon, or never?

ahmaman2mo ago

Moving forward, I wonder if such AI capabilities will widen the security gap between open-source and proprietary software?

zb32mo ago

Yeah, makes sense. Those countries are bad because they execute state-sponsored cyber attacks, the US and Israel on the other hand are good, they only execute state-sponsored defense.

User232mo ago

How much of Mythos’s internals will researchers be able to recover from the flood of patches?

5d41402abc4b2mo ago

Are there any local models that I can set up to run on my code as part of CI?

kmfrk2mo ago

Heck of a Patch Tuesday.

MisterBiggs2mo ago

What happens once an agent can reliably get 100% on SWE-bench?

yalogin2mo ago

Has anyone played with the released versions of Claude and tried to create exploits? I cannot imagine it not being able to craft one if guided, unless the tooling around it doesn’t allow it

spprashant2mo ago

We finally have the answer to the question: when do these labs stop giving away intelligence to the general public for $20 a month?

Selling shovels is now worth less than taking all the gold for themselves.

paoliniluis2mo ago

Does everyone agree that this makes Dario Amodei more powerful than any politician in the world? Anthropic is now the owner of the most powerful cyberweapon ever made.

cerved2mo ago

Anthropic should run it on their own code

maxmaio2mo ago

seems important and terrifying. This morning Opus 4.6 was blowing my mind in claude code... onward and upward

finchisko2mo ago

Wait, isn't this how Skynet started?

throwaway9112822mo ago

The pumping has been taken to a new level: the model is so godlike that it can't be released as it is. This must be a joke.

nickandbro2mo ago

I want it

copypaper2mo ago

Yea, but can it secure systems from the unpatchable $5 wrench vulnerability?

https://xkcd.com/538/

Mecha_SalesCast2mo ago

we should notice that we've already reached the point where AI models are too dangerous to publicly release

cdelsolar2mo ago

let us have mythos damn it

0xbadcafebee2mo ago

tl;dr we find vulns so we can help big companies fix their security holes quickly (and so they can profit off it)

asG112mo ago

This will fizzle out and the weirdo will have to pivot to their next marketing scheme.

Surac2mo ago

namedropping hell.

yusufozkan2mo ago

but people here had told me llms just predict the next word

jaspanglia2mo ago

What they will eventually do is deliberately take more control over what people want and work toward. We don't trust such institutions after witnessing GATES thuggery all over.

cmiles82mo ago

Got it.

6thbit2mo ago

This is silly and disingenuous. In a matter of days or weeks a competing lab will make public a model with capabilities beyond this “mythos” one.

Is this a huge fear-driven marketing stunt to get governments and corporations into dealing with anthropic?

gnarlouse2mo ago

A cybersecurity pandemic will surely be the Hiroshima that wakes people up to AI. /s

Ms-J2mo ago

Anthropic and ClosedAI are some of the biggest bullshitters in the industry.

ehutch792mo ago

Just include 'make it secure' in the prompt. Duh.

lasky2mo ago

The hype machine is alive and well in silicon valley.

cmiles82mo ago

I’m sure it’s a decent model. But it’s also clear folks are running out of runway and desperate to find something that sticks and keeps the party going.

Meanwhile the folks let in on the “secret” are those that also desperately need for the hype to continue to protect their own positions in this game.

Look forward to a model upgrade but the hype fluff games are getting old. Watching OpenAI completely crash out of pole position on the hype train though has been at least amusing.

123malware3212mo ago

I don't know anyone impressed by these tools who also earns their paycheck doing bug bounties and finding actual CVEs.

Generally these things only find memory corruption stuff which is almost never the type of bug you're looking for, and it costs a lot which negates your bug bounty payout.

Each time they preach, ooh, 0day found, bla bla.

In this domain you need to be specific or you are just yelling clickbait into the wind.

What type of 0day? What did the exploit actually look like?

Get a dopamine hit, post on reddit, LOL. Hacking the planet (powered by Claude -_-)

tdaltonc2mo ago

> Mythos finds bug.

> NSA demands that bug stays in place and gags Anthropic.

> Anthropic releases Mythos.

Then what? Is a huge share of the US zero-day stockpiles about to be disarmed or proliferated?

dakolli2mo ago

If this is as dangerous as they make it out to be (it's not), why would their first impulse be to get every critical product/system/corporation in the world to adopt it?

manbash2mo ago

This will likely not see the light of day. It's the usual PR that gathers many "partnerships".

Expect to see lots of these in the upcoming months as the big companies scramble to keep from losing money.
