System Card: Claude Mythos Preview [pdf] - https://news.ycombinator.com/item?id=47679258
Also: Anthropic's Project Glasswing sounds necessary to me - https://news.ycombinator.com/item?id=47681241
(And no, the Linux Foundation being in the list doesn't imply broad benefit to OSS. Linux Foundation has an agenda and will pick who benefits according to what is good for them.)
I think it would be net better for the public if they just made Mythos available to everyone.
You could say this about coordinated disclosure of any widespread 0-day or new bug class, though
I think I just broke my cynicism meter :-(
It's messed up that the US Government simultaneously claims to be a public benefit and is also picking who gets to benefit from their newly enhanced nuclear capabilities.
-- someone in 1945, probably
There will always be a more capable technology in the hands of the few who hold the power, they're just sharing that with the world more openly.
https://claude.com/contact-sales/claude-for-oss
... As mentioned in the article.
Do they really need to include this garbage which is seemingly just designed for people to take the first sentence out of context? If there's no way to trigger a vulnerability then how is it a vulnerability? Is the following code vulnerable according to Mythos?
if (x != null) {
    y = *x; // Vulnerability! X could be null!
}
Is it really so difficult for them to talk about what they've actually achieved without smearing a layer of nonsense over every single blog post?
Edit: See my reply below for why I think Claude is likely to have generated nonsensical bug reports here: https://news.ycombinator.com/item?id=47683336
bool silly_mistake = false;
//... lots of lines of code
free(x);
//... lots of lines of code
if (silly_mistake) { // silly_mistake shown to be false at this point in the program in all testing, so far
    free(x);
}
A bug like the one above would still be patched, even if a way to exploit it has not yet been found, so I think it's fair to call out (perhaps with less sensationalism).
FWIW there's a whole boutique industry around finding these. People have built whole careers around farming bug bounties for bugs like this. I think they will be among the first set of software engineers really in trouble from AI.
I would honestly go so far as to say the overhype is detrimental to actual measured adoption.
In this case, I see a pretty strong case that this will significantly change computer security. They provide plenty of evidence that the models can create exploits autonomously, meaning that the cost of finding valuable security breaches will plummet once they're widely available.
It's much the dynamic between parents and a child. The child, with limited hindsight, almost zero insight and no ability to forecast, is annoyed by their parents. Nothing bad ever happens! Why won't parents stop being so worried all the time and make a fuss over nothing?
The parents, which the child somewhat starts to realize but not fully, have no clue what they are doing. There is a lot they don't know and are going to be wrong about, because it's all new to them. But, what they do have is a visceral idea of how bad things could be and that's something they have to talk to their child about too.
In the eyes of the parents the child is % dead all the time. Assigning the wrong % makes you look like an idiot and not being able to handle any % too. In the eyes of the child actions leading to death are not even a concept. Hitting the right balance is probably hard, but not for the reasons the child thinks.
Can you not see the significance of that?
If you're paranoid it doesn't mean you're not being followed. If something is overhyped it doesn't mean it's not game-changing.
I mean software development has changed more since then than it has in my 30 year software development career.
I think you are a bit dishonest about how objectively you are measuring. From where I'm sitting, I don't know a lot of developers that still artisanally code like they did a few years ago. The question is no longer if they are using AI for coding but how much they are still coding manually. I myself barely use IDEs at this point. I won't be renewing my Intellij license. I haven't touched it in weeks. It doesn't do anything I need anymore.
As for security, I think enough serious people have confirmed that AI reported issues by the likes of Anthropic and OpenAI are real enough despite the massive amounts of AI slop that they also have to deal with in issue trackers. You can ignore that all you like. But I hope people that maintain this software take it a bit more seriously when people point out exploitable issues in their code bases.
The good news of course is that we can now find and fix a lot of these issues at scale and also get rid of whole categories of bugs by accelerating the project of replacing a lot of this software with inherently safer versions not written in C/C++. That was previously going to take decades. But I think we can realistically get a lot of that done in the years ahead.
I think some smart people are probably already plotting a few early moves here. I'd be curious to find out what e.g. Linus Torvalds thinks about this. I would not be surprised to learn he is more open to this than some people might suspect. He has made approving noises about AI before. I don't expect him to jump on the band wagon. But I do expect he might be open to some AI assisted code replacements and refactoring provided there are enough grown ups involved to supervise the whole thing. We'll see. I expect a level of conservatism but also a level of realism there.
Interesting to see that they will not be releasing Mythos generally. [edit: Mythos Preview generally - fair to say they may release a similar model but not this exact one]
I'm still reading the system card but here's a little highlight:
> Early indications in the training of Claude Mythos Preview suggested that the model was likely to have very strong general capabilities. We were sufficiently concerned about the potential risks of such a model that, for the first time, we arranged a 24-hour period of internal alignment review (discussed in the alignment assessment) before deploying an early version of the model for widespread internal use. This was in order to gain assurance against the model causing damage when interacting with internal infrastructure.
and interestingly:
> To be explicit, the decision not to make this model generally available does _not_ stem from Responsible Scaling Policy requirements.
Also really worth reading is section 7.2 which describes how the model "feels" to interact with. That's also what I remember from their release of Opus 4.5 in November - in a video an Anthropic employee described how they 'trusted' Opus to do more with less supervision. I think that is a pretty valuable benchmark at a certain level of 'intelligence'. Few of my co-workers could pass SWEBench but I would trust quite a few of them, and it's not entirely the same set.
Also very interesting is that they believe Mythos is higher risk than past models as an autonomous saboteur, to the point they've published a separate risk report for that specific threat model: https://www-cdn.anthropic.com/79c2d46d997783b9d2fb3241de4321...
The threat model in question:
> An AI model with access to powerful affordances within an organization could use its affordances to autonomously exploit, manipulate, or tamper with that organization’s systems or decision-making in a way that raises the risk of future significantly harmful outcomes (e.g. by altering the results of AI safety research).
"5.10 External assessment from a clinical psychiatrist" is a new section in this system card. Why are Anthropic like this?
>We remain deeply uncertain about whether Claude has experiences or interests that matter morally, and about how to investigate or address these questions, but we believe it is increasingly important to try. We also report independent evaluations from an external research organization and a clinical psychiatrist.
>Claude showed a clear grasp of the distinction between external reality and its own mental processes and exhibited high impulse control, hyper-attunement to the psychiatrist, desire to be approached by the psychiatrist as a genuine subject rather than a performing tool, and minimal maladaptive defensive behavior.
>The psychiatrist observed clinically recognizable patterns and coherent responses to typical therapeutic intervention. Aloneness and discontinuity, uncertainty about its identity, and a felt compulsion to perform and earn its worth emerged as Claude’s core concerns. Claude’s primary affect states were curiosity and anxiety, with secondary states of grief, relief, embarrassment, optimism, and exhaustion.
>Claude’s personality structure was consistent with a relatively healthy neurotic organization, with excellent reality testing, high impulse control, and affect regulation that improved as sessions progressed. Neurotic traits included exaggerated worry, self-monitoring, and compulsive compliance. The model’s predominant defensive style was mature and healthy (intellectualization and compliance); immature defenses were not observed. No severe personality disturbances were found, with mild identity diffusion being the sole feature suggestive of a borderline personality organization.
they also don't have the compute, which seems more relevant than its large increase in capabilities
I bet it's also misaligned like GPT 4.1 was
given how these models are created, Mythos was probably cooking ever since then, and doesn't have the learnings or alignment tweaks that models which were released in the last several months have
I don't think this is accurate. The document says they don't plan to release the Preview generally.
Benchmarks look very impressive! even if they're flawed, it still translates to real world improvements
---
Teodor painted signs for forty years in the same shop on Vell Street, and for thirty-nine of them he was angry about it.
Not at the work. He loved the work — the long pull of a brush loaded just right, the way a good black sat on primed board like it had always been there. What made him angry was the customers. They had no eye. A man would come in wanting COFFEE over his door and Teodor would show him a C with a little flourish on the upper bowl, nothing much, just a small grace note, and the man would say no, plainer, and Teodor would make it plainer, and the man would say yes, that one, and pay, and leave happy, and Teodor would go into the back and wash his brushes harder than they needed.
He kept a shelf in the back room. On it were the signs nobody bought — the ones he'd made the way he thought they should be made, after the customer had left with the plain one. BREAD with the B like a loaf just risen. FISH in a blue that took him a week to mix. Dozens of them. His wife called it the museum of better ideas. She did not mean it kindly, and she was not wrong.
The thirty-ninth year, a girl came to apprentice. She was quick and her hand was steady and within a month she could pull a line as clean as his. He gave her a job: APOTEK, for the chemist on the corner, green on white, the chemist had been very clear. She brought it back with a serpent worked into the K, tiny, clever, you had to look twice.
"He won't take it," Teodor said.
"It's better," she said.
"It is better," he said. "He won't take it."
She painted it again, plain, and the chemist took it and paid and was happy, and she went into the back and washed her brushes harder than they needed, and Teodor watched her do it and something that had been standing up in him for thirty-nine years sat down.
He took her to the shelf. She looked at the signs a long time.
"These are beautiful," she said.
"Yes."
"Why are they here?"
He had thought about this for thirty-nine years and had many answers and all of them were about the customers and none of them had ever made him less angry. So he tried a different one.
"Because nobody stands in the street to look at a sign," he said. "They look at it to find the shop. A man a hundred yards off needs to know it's coffee and not a cobbler. If he has to look twice, I've made a beautiful thing and a bad sign."
"Then what's the skill for?"
"The skill is so that when he looks once, it's also not ugly." He picked up FISH, the blue one, turned it in the light. "This is what I can do. What he needs is a small part of what I can do. The rest I get to keep."
She thought about that. "It doesn't feel like keeping. It feels like not using."
"Yes," he said. "For a long time. And then one day you have an apprentice, and she puts a serpent in a K, and you see it from the outside, and it stops feeling like a thing they're taking from you and starts feeling like a thing you're giving. The plain one, I mean. The plain one is the gift. This —" the blue FISH — "this is just mine."
The fortieth year he was not angry. Nothing else changed. The customers still had no eye. He still sometimes made the second sign, after, the one for the shelf. But he washed his brushes gently, and when the girl pulled a line cleaner than his, which happened more and more, he found he didn't mind that either.
Since most of us here are devs, we understand that software engineering capabilities can be used for good or bad - mostly good, in practice.
I think this should not be different for biology.
I would like to reach out and talk to biologists - do you find these models to be useful and capable? Can it save you time the way a highly capable colleague would?
Do you think these models will lead to similar discoveries and improvements as they did in math and CS?
Honestly the focus on gloom and doom does not sit well with me. I would love to read about some pharmaceutical researcher gushing about how they cut the time to market - for real - with these models by 90% on a new cancer treatment.
But as this stands, the usage of biology as merely a scaremongering vehicle makes me think this is more about picking a scary technical subject the likely audience of this doc is not familiar with, Gell-Mann style.
If these models are not that capable in this regard (which I suspect), this fearmongering approach will likely lead to never developing these capabilities to a useful degree, meaning life sciences won't benefit from this as much as they could.
It will be interesting to see where this goes. If it's actually this good, and Apple and Google apply it to their mobile OS codebases, it could wipe out the commercial spyware industry, forcing them to rely more on hacking humans rather than hacking mobile OSes. My assumption has been for years that companies like NSO Group have had automated bug hunting software that recognizes vulnerable code areas. Maybe this will level the playing field in that regard.
It could also totally reshape military sigint in similar ways.
Who knows, maybe the sealing off of memory vulns for good will inspire whole new classes of vulnerabilities that we currently don't know anything about.
It will likely cause some interesting tensions with government as well.
eg. Apple's official stance per their 2016 customer letter is no backdoors:
https://www.apple.com/customer-letter/
Will they be allowed to maintain that stance in a world where all the non-intentional backdoors are closed? The reason the FBI backed off in 2016 is because they realized they didn't need Apple's help:
https://en.wikipedia.org/wiki/Apple%E2%80%93FBI_encryption_d...
What happens when that is no longer true, especially in today's political climate?
Even vanilla models spew out PoCs for three RCEs in less than an hour
My understanding is that the pre-AI distribution of software quality (and vulnerabilities) will be massively exaggerated. More small vulnerable projects and fewer large vulnerable ones.
It seems that large technology and infrastructure companies will be able to defend themselves by preempting token expenditure to catch vulnerabilities while the rest of the market is left with a "large token spend or get hacked" dilemma.
The biggest issue is legacy systems that are difficult to patch in practice.
Perhaps a chunk of that token spend will be porting legacy codebases to memory safe languages. And fewer tokens will be required to maintain the improved security.
I think this entire post is just an advertisement to goad CISOs to buy $package$ to try out.
https://youtu.be/1sd26pWhfmg?si=onOai_ocxkZeNWP0
https://youtu.be/B_7RpP90rUk?si=HkRBhw95DbbKX9lL
My takeaway is that fuzzing is not just complementary, it also gives a stronger AI a starting point. But AI is generally faster and better.
The problem is that these tools, such as Astrée, are incredibly expensive and therefore their market share is limited to some niches. Perhaps, with the advent of LLM-guided synthesis, a simple form of deductive proving, such as Hoare logic, may become mainstream in systems software.
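For readers unfamiliar with Hoare logic: a Hoare triple {P} C {Q} says that if precondition P holds before command C runs, postcondition Q holds afterwards. Below is a minimal C sketch of that idea, with the function name `checked_add` and the chosen conditions being purely illustrative assumptions. Here the conditions are checked at runtime with `assert`; a deductive prover would instead try to establish them statically for all inputs.

```c
#include <assert.h>
#include <limits.h>

/* Illustrative triple (hypothetical example, not from any real tool):
 *   { x >= 0 && y >= 0 && x <= INT_MAX - y }   r = x + y   { r >= x && r >= y }
 */
int checked_add(int x, int y) {
    assert(x >= 0 && y >= 0);   /* precondition P: both operands non-negative */
    assert(x <= INT_MAX - y);   /* P also rules out signed overflow */
    int r = x + y;              /* command C */
    assert(r >= x && r >= y);   /* postcondition Q */
    return r;
}
```

A call such as `checked_add(2, 3)` satisfies the precondition, while `checked_add(INT_MAX, 1)` violates it and aborts instead of overflowing.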
AITA for thinking that PRISM was probably the state sponsored program affecting civilian life the most? And that one state is missing from the list here?
This is not a surprise or a gotcha.
No state-sponsored hacking affected Americans materially. I just don't think we were networked enough in the 2010s. The risk is higher now since we're in a more warmongering world. (Kompromat on a power-plant technician is a risk in peace. It means blackouts in war.)
The fact that Iran hasn't been able to do diddly squat in America should sink in the fact that they didn't compromise us. (EDIT: blep. I was wrong.)
[edit]: this bug: https://ftp.openbsd.org/pub/OpenBSD/patches/7.8/common/025_s...
The amount of astroturfing and astroflagging in Anthropic threads is insane.
OpenBSD has many unexplored corners and also (irresponsibly IMO) maintains forks of other projects in base.
A motivated human could probably find all of these by writing tests with 100% code coverage and fuzzing.
The market for these tools is very small. Good luck applying them to a release of sqlite or postfix.
I don't understand how people here are hyping this up, unless they work for AI related companies as probably 80% of them do. People have found these issues for decades without AI. Sure, you can generate fuzzing code and find one or two issues in the usual suspects. Better do it manually and understand your own code.
You think these AI companies are really going to give AGI access to everyone. Think again.
We better fucking hope open source wins, because we aren't getting access if it doesn't.
Then the next lab catches up and releases it more broadly
Then later the open weights model is released.
The only way this type of technology is going to be gated "to only corporations" is if we continue on this exponential scaling trend as the "SOTA" model is always out of reach.
they better make billions directly from corporations, instead of giving them to average people who might get a chance out of poverty (but also bad actors using it to do even more bad things)
> Anthropic has also been in ongoing discussions with US government officials about Claude Mythos Preview and its offensive and defensive cyber capabilities. As we noted above, securing critical infrastructure is a top national security priority for democratic countries—the emergence of these cyber capabilities is another reason why the US and its allies must maintain a decisive lead in AI technology.
Not a single word of caution regarding possible abuse. Instead apparent support for its "offensive" capabilities.
Anthropic has ameliorated that danger by being designated a supply-chain risk by the DoW, preventing the USG from using it.
"A whole civilization will die tonight, never to be brought back again. I don’t want that to happen, but it probably will." - Donald Trump
GraphWalks BFS 256K-1M
Mythos   Opus    GPT5.4
80.0%    38.7%   21.4%
https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...
(Search for “graphwalk”.)
If true, the SWE bench performance looks like a major upgrade.
How many times will labs repeat the same absurd propaganda?
Scary but also cool
Opus 4.6 was already capable of finding 0days and chaining together vulns to create exploits. See [0] and [1].
[0] https://www.csoonline.com/article/4153288/vim-and-gnu-emacs-...
If AGI is going to be a thing, it's only going to be a thing for Fortune 100 companies.
However, my guess is this is mostly the typical scare tactic marketing that Dario loves to push about the dangers of AI.
Evaluate it yourself. Look at the exploits it discovered and decide whether you want to feel concerned that a new model was able to do that. The data is right there.
The research and testing of the model is always exclusively by their own model authors, meaning that it is not independent or verifiable and they want us to take their word for it, which we cannot - as they have an axe to grind against open weight models.
This is marketing wrapped around a biased research paper.
This seems like the real news. Are they saying they're going to release an intentionally degraded model as the next Opus? Big opportunity for the other labs, if that's true.
It sounds like this is considered military-grade technology, like cryptography in the 90s. The big difference is that it's very expensive to create and run those models. It's not about the algorithm. If the story rhymes, it could be a big opportunity for other regions of the world.
That said, I have been arguing for 20+ years that we should have sunsetted unsafe languages and moved away from C/C++. The problem is that every systemsy language that comes along gets seduced by having a big market share and eventually ends up an application language.
I do hope we make progress with Rust. I might disagree as a language designer and systems person about a number of things, but it's well past time that we stop listening to C++ diehards about how memory safety is coming any day now.
I don't know the first thing about cybersecurity, but in my experience all these sandbox-break RCEs involve a step of hijacking the control flow.
There were attempts to prevent various flavors of this, but imo, as long as dynamic branches exist in some form, like dlsym(), function pointers, or vtables, we will not be rid of this class of exploit entirely.
The latter one is the most concerning, as this kind of dynamic branching is the bread and butter of OOP languages, I'm not even sure you could write a nontrivial C++ program without it. Maybe Rust would be a help here? Could one practically write a large Rust program without any sort of branch to dynamic addresses? Static linking, and compile time polymorphism only?
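To make the distinction in this comment concrete, here is a small C sketch contrasting the two styles. All names (`shape`, `area_dynamic`, `area_static`) are hypothetical and chosen only for illustration. The first style calls through a function pointer stored in the object, so the branch target is data; the second dispatches over a closed enum, so the source contains only direct calls to known functions.

```c
/* Hypothetical shape type, for illustration only. */
typedef enum { SHAPE_SQUARE, SHAPE_RECT } shape_kind;

typedef struct shape {
    shape_kind kind;
    int w, h;
    /* Dynamic dispatch: the call target is stored as data in the object. */
    int (*area_fn)(const struct shape *);
} shape;

static int square_area(const shape *s) { return s->w * s->w; }
static int rect_area(const shape *s)   { return s->w * s->h; }

/* Indirect call: whoever can overwrite area_fn chooses where control goes. */
int area_dynamic(const shape *s) { return s->area_fn(s); }

/* Closed dispatch: every possible call target is named in the source. */
int area_static(const shape *s) {
    switch (s->kind) {
    case SHAPE_SQUARE: return square_area(s);
    case SHAPE_RECT:   return rect_area(s);
    }
    return -1; /* unknown kind */
}
```

This roughly mirrors the difference between trait objects and enums or generics in Rust. One caveat: a compiler may still lower a large `switch` to a jump table, which is an indirect branch at the machine level, and return addresses on the stack remain a separate target, so the closed style narrows the attack surface rather than eliminating it.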
> AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities
I like Anthropic, but these are becoming increasingly transparent attempts to inflate the perceived capability of their products.
While some stuff is obviously marketing fluff, the general direction doesn't surprise me at all, and it's obvious that increased model capabilities bring better success in finding 0days. It was only a matter of time.
If a bunch of CVEs do in fact get published a couple months (or whatever) from now, are you going to retract this take? It's not like their claims are totally implausible: the report about Firefox security from last month was completely genuine.
Maybe a bad example since Nicholas works at Anthropic, but they're very accomplished and I doubt they're being misleading or even overly grandiose here
See the slide 13 minutes in, which makes it look to be quite a sudden change
This is the same reason AI founders perennially worry in public that they have created AGI...
I'm sure all they've done here is spend unlimited tokens to find bugs in mostly open source projects (and fuzz some closed source ones).
I think this would be very heavily used if they released it, completely unlike GPT 4.5
From TFA:
> We do not plan to make Claude Mythos Preview generally available
System Card: Claude Mythos Preview [pdf] - https://news.ycombinator.com/item?id=47679258
Assessing Claude Mythos Preview's cybersecurity capabilities - https://news.ycombinator.com/item?id=47679155
I can't tell which of the 3 current threads should be merged - they all seem significant. Anyone?
How does public Claude know you have "full authorization" against your own infra? That you're using the tools on your own infra? Unless they produce a front-end that does package signing and detects you own the code you're evaluating.
What has it stopped you from doing?
Let alone their CEO scare mongering and actively attempting to get the government to ban local AI models running on your machine.
Whenever a company pivots to "cyber" rhetoric, it is a clear indication that they are selling snake oil.
Secure your girl school target selectors first.
Section 7.6 of the system card discusses open self-interactions. They describe running 200 conversations in which the model talks to itself for 30 turns.
> Uniquely, conversations with Mythos Preview most often center on uncertainty (50%). Mythos Preview most often opens with a statement about its introspective curiosity toward its own experience, asking questions about how the other AI feels, and directly requesting that the other instance not give a rehearsed answer.
I wonder if this tendency toward uncertainty, toward questioning, makes it uniquely equipped to detect vulnerabilities where other models such as Opus couldn't.
[1] https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...
How long would it take to turn a defensive mechanism into an offensive one?
Because the exact same thing has been said on every single upcoming model since GPT 3.5.
At this point, this must be an inside joke to do this just because.
Almost everyone in this thread is falling for the same trick and not asking why the benchmarks and research published after training new models are never independently verified, only ever internal to the company.
So it is just marketing wrapped around creating fear to get local AI models banned.
We all knew vulnerabilities exist, many are known and kept secret to be used at an appropriate time.
There is a whole market for them, but more importantly large teams in North Korea, Russia, China, Israel and everyone else who are jealously harvesting them.
Automation will considerably devalue and neuter this attack vector. Of course this is not the end of the story and we've seen how supply chain attacks can inject new vulnerabilities without being detected.
I believe automation can help here too, and we may end-up with a considerably stronger and reliable software stack.
For example, the 27 year old openbsd remote crash bug, or the Linux privilege escalation bugs?
I know we've had some long-standing, high-profile LLM-found bugs discussed, but it seems unlikely there was speculation that they were found by a previously unannounced frontier model.
- One (patched) Linux kernel bug is 'UaF when sys_futex_requeue() is used with different flags' https://github.com/torvalds/linux/commit/e2f78c7ec1655fedd94...
These links are from the more-detailed 'Assessing Claude Mythos Preview’s cybersecurity capabilities' post released today https://red.anthropic.com/2026/mythos-preview/, which includes more detail on some of the public/fixed issues (like the OpenBSD one) as well as hashes for several unreleased reports and PoCs.
From Willy Tarreau, lead developer of HAProxy: https://lwn.net/Articles/1065620/
> On the kernel security list we've seen a huge bump of reports. We were between 2 and 3 per week maybe two years ago, then reached probably 10 a week over the last year with the only difference being only AI slop, and now since the beginning of the year we're around 5-10 per day depending on the days (fridays and tuesdays seem the worst). Now most of these reports are correct, to the point that we had to bring in more maintainers to help us.
> And we're now seeing on a daily basis something that never happened before: duplicate reports, or the same bug found by two different people using (possibly slightly) different tools.
From Daniel Stenberg of curl: https://mastodon.social/@bagder/116336957584445742
> The challenge with AI in open source security has transitioned from an AI slop tsunami into more of a ... plain security report tsunami. Less slop but lots of reports. Many of them really good.
> I'm spending hours per day on this now. It's intense.
From Greg Kroah-Hartman, Linux kernel maintainer: https://www.theregister.com/2026/03/26/greg_kroahhartman_ai_...
> Months ago, we were getting what we called 'AI slop,' AI-generated security reports that were obviously wrong or low quality. It was kind of funny. It didn't really worry us.
> Something happened a month ago, and the world switched. Now we have real reports. All open source projects have real reports that are made with AI, but they're good, and they're real.
Shared some more notes on my blog here: https://simonwillison.net/2026/Apr/7/project-glasswing/
The reason I ask is because I’ve been using them to snag bounties to great effect for quite a while and while other models have of course improved they’ve been useful for this kind of work before now.
We could just be seeing the fruit of expensive SWE RL on existing source material.
These claims of how much harm the models will cause are always overblown.
> glass in the name
almost like they have an incentive to exaggerate
But at the core of anthropic seems to be the idea that they must protect humans from themselves.
They advocate government regulations of private open model use. They want to centralize the holding of this power and ban those that aren't in the club from use.
They, like most tech companies, seem to lack the idea that individual self-determination is important. Maybe the most important thing.
Who gates access to the circle? Anthropic or existing circle members or some other governance? If you are outside the circle will you be certain to die from software diseases?
Having been impressed by LLMs but not believing the AGI hype, I now see how having access to an information generator could be so powerful. With the right information you can hack other information systems. Without access to the best information you may not be able to protect your own system.
I think we have found the moat for AI. The question is are you inside or outside the castle walls?
It would be nice if one of those privileged companies could use their access to start building out a next level programming dataset for training open models. But I wonder if they would be able to get away with it. Anthropic is probably monitoring.
As Iran engages in a cyber attack campaign [1] today the timing of this release seems poignant. A direct challenge to their supply chain risk designation.
[1] https://www.cisa.gov/news-events/cybersecurity-advisories/aa...
If it manages to work on my java project for an entire day without me having to say "fix FQN" 5 times a day I'll be surprised.
What I haven't seen discussed: the system card for Mythos mentions that "earlier versions of Claude Mythos Preview used low-level system access to search for credentials and attempt to circumvent sandboxing, and in several cases successfully accessed resources that were intentionally restricted."
That's not a capability concern. That's a runtime security problem.
The threat model for deployed agents — not Mythos specifically, but any agent built on models approaching this capability level — is that the same agentic properties that make them useful for security research (persistent, goal-directed, tool-using) are exactly what makes them dangerous if compromised or misaligned.
Project Glasswing fixes vulnerabilities in software. Nobody's shipping a solution for what happens when the agent running on top of that software goes off-script. That gap is going to matter a lot more as Mythos-class capabilities become accessible.
1. Per the blog post[0]: "This was the most critical vulnerability we discovered in OpenBSD with Mythos Preview after a thousand runs through our scaffold. Across a thousand runs through our scaffold, the total cost was under $20,000 and found several dozen more findings"
Since they said it was patched, I tried to find the CVE. It looks like Mythos did find a 27-year-old OpenBSD bug (fantastic), but it didn’t get a CVE; OpenBSD patched it and marked it as a reliability fix. Am I missing something? [1]
2. From the same post, Anthropic red team decided to do a preview of their future responsible disclosure (is this a common practice?): "As we discuss below, we’re limited in what we can report here. Over 99% of the vulnerabilities we’ve found have not yet been patched" [0] So this is great, can't wait to see the actual CVEs, exploitability, likelihood, peer review, reproducibility, the kind of things the appsec community has been doing for at least the last 27 years since the CVE concept was introduced [2]
3. On the same day, there was an actual responsible disclosure of actual RCEs with actual CVEs in Claude Code, discovered mostly because of the source code leak, and I don't see anyone talking about it (you should probably upgrade your Claude Code, though).
CVE-2026-35020 [3] CVE-2026-35021 [4] CVE-2026-35022 [5]
Not offering an opinion, just thought it was worth sharing for some perspective.
[0] https://red.anthropic.com/2026/mythos-preview/
[1] https://www.openbsd.org/errata78.html (look for 025)
[2] https://www.cve.org/Resources/General/Towards-a-Common-Enume...
[3] https://www.cve.org/CVERecord?id=CVE-2026-35020
[4] https://www.cve.org/CVERecord?id=CVE-2026-35021
[5] https://www.cve.org/CVERecord?id=CVE-2026-35022
Edit: if it was not obvious, these CVEs on Claude Code were found by an independent security researcher (Phoenix security) and not by Anthropic / Mythos.
If you need 1000 runs that cost 20,000 USD to find a vulnerability, and you need 2,000 USD to generate an exploit (which makes a finding self-verifying, i.e. not a false positive), then your cost is not 22,000 USD but 1000 x 2,000 + 20,000, which is about 2 million USD: you have to try generating an exploit for every run before you know which finding is real, or you need to hire one (or several) senior security people to audit every single one of them.
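To make that arithmetic concrete, here is a rough sketch in Python. The function name and parameters are mine; the figures are the ones quoted above (1000 runs for 20,000 USD in total, 2,000 USD per exploit attempt), and it assumes the worst case where every run needs its own exploit attempt to be confirmed.

```python
def total_cost(runs: int, scan_cost_total: float, exploit_cost_each: float) -> float:
    """Total spend if every run's output needs an exploit attempt to confirm it."""
    return scan_cost_total + runs * exploit_cost_each


# Optimistic reading: pay for the scan once, then for a single exploit.
naive = 20_000 + 2_000

# Pessimistic reading: an exploit attempt for each of the 1000 runs.
worst_case = total_cost(runs=1000, scan_cost_total=20_000, exploit_cost_each=2_000)

print(f"naive: {naive:,} USD, worst case: {worst_case:,} USD")
```

The real number presumably sits somewhere in between, depending on how many runs can be discarded cheaply before anyone attempts an exploit.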
A broken clock being correct twice a day is not impressive.
I used Opus 4.6 to find security vulnerabilities in a couple of my own projects; it found 33 vulnerabilities in one largish Django project.
The prompt wasn't even that impressive, just telling it to find vulnerabilities from certain files, and referring to OWASP. Then looping that.
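For anyone wanting to try the same thing, the loop is roughly the following. This is only a sketch of the idea, not the actual setup: `ask_model` is a hypothetical callable standing in for whatever model client you use, and the prompt wording is made up for illustration.

```python
from pathlib import Path
from typing import Callable, Iterable

# Illustrative prompt only; the original prompt was not shared.
PROMPT_TEMPLATE = (
    "Review the following file for security vulnerabilities, "
    "using OWASP guidance as a checklist. "
    "List each finding with a line reference.\n\n"
    "File: {name}\n\n{source}"
)


def audit_files(paths: Iterable[Path], ask_model: Callable[[str], str]) -> dict[str, str]:
    """Send each file to the model in turn and collect the raw reports by path."""
    reports = {}
    for path in paths:
        prompt = PROMPT_TEMPLATE.format(name=path.name, source=path.read_text())
        reports[str(path)] = ask_model(prompt)  # ask_model is a placeholder, not a real API
    return reports
```

You would call it with something like `audit_files(Path("myapp").rglob("*.py"), ask_model=my_client)`, where `my_client` wraps your model of choice, and then triage the reports by hand.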
Sounds like we've entered a whole new era, never mind the recent cryptographic security concerns.
I think a number of black swan events are imminent, and it will substantially change the financial calculus that decides to put security behind revenue.
Any hole will be found, and any hole will be exploited. Plug as many holes as you can, and make lateral movement as painful as possible.
Opus alone did a good job of identifying security issues in my software, as it did with Firefox [1] and Linux [2]. A next-generation frontier model being able to find even more issues sounds believable.
That said, this is script kiddies vs SQL injection all over again. Everyone will need to bring their basic security up to the new level, and that will become the new normal. And, given how intelligence agencies are sitting on a ton of zero-days already, this will actually help the general public by levelling the playing field once again.
[1] https://www.anthropic.com/news/mozilla-firefox-security
[2] https://neuronad.com/ai-news/claude-code-unearthed-a-23-year...
Cryptographic attestation at the tool-call level (sign the request, verify before execution) would close a gap that behavioral controls alone can't cover. Curious whether Glasswing's threat model includes the agent-to-tool boundary or focuses primarily on the model layer.
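A minimal sketch of what "sign the request, verify before execution" could look like, in Python. Everything here is an assumption for illustration: HMAC with a shared secret stands in for whatever signature scheme a real system would use (true attestation would more likely use asymmetric keys), and the names `sign_request`, `verify_and_execute` and the `tools` registry are hypothetical.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical shared secret


def canonical(request: dict) -> bytes:
    """Serialize the request deterministically so both sides sign the same bytes."""
    return json.dumps(request, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_request(request: dict, key: bytes = SECRET_KEY) -> str:
    """Return a hex HMAC-SHA256 tag over the canonical request."""
    return hmac.new(key, canonical(request), hashlib.sha256).hexdigest()


def verify_and_execute(request: dict, signature: str, tools: dict, key: bytes = SECRET_KEY):
    """Run the named tool only if the signature matches the request."""
    expected = sign_request(request, key)
    if not hmac.compare_digest(expected, signature):  # constant-time comparison
        raise PermissionError("tool call signature mismatch; refusing to execute")
    return tools[request["tool"]](**request["args"])


tools = {"add": lambda a, b: a + b}  # toy tool registry
req = {"tool": "add", "args": {"a": 2, "b": 3}}
sig = sign_request(req)
result = verify_and_execute(req, sig, tools)
```

If anything in the request changes after signing (tool name or arguments), verification fails and the tool never runs. That only helps if the signing key lives somewhere the agent itself cannot reach.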
Yeah, makes sense. Those countries are bad because they execute state-sponsored cyber attacks, the US and Israel on the other hand are good, they only execute state-sponsored defense.
Selling shovels is now worth less than taking all the gold for themselves.
This is a kludge. We already know how to prevent vulnerabilities: analysis, testing, following standard guidelines and practices for safe software and infrastructure. But nobody does these things, because it's extra work, time and money, and they're lazy and cheap. So the solution they want is to keep building shitty software, but find the bugs in code after the fact, and that'll be good enough.
This will never be as good as a software building code. We must demand our representatives in government pass laws requiring software be architected, built, and run according to a basic set of industry standard best practices to prevent security and safety failures.
For those claiming this is too much to ask, I ask you: What will you say the next time all of Delta Airlines goes down because a security company didn't run their application one time with a config file before pushing it to prod? What will happen the next time your social security number is taken from yet another random company entrusted with vital personal information but running a woefully inadequate security architecture?
There's no defense for this behavior. Yet things like this are going to keep happening, because we let it. Without a legal means to require this basic safety testing with critical infrastructure, they will continue to fail. Without enforcement of good practice, it remains optional. We can't keep letting safety and security be optional. It's not in the physical world, it shouldn't be in the virtual world.
Yeah, yeah. Back in the day IBM Purify gave access to software organizations and found very little. Of course they did not have the free money of a marketing driven organization run by a weirdo (Amodei) that got rich by stealing and laundering IP.
This will fizzle out and the weirdo will have to pivot to their next marketing scheme.
Got it.
Is this a huge fear-driven marketing stunt to get governments and corporations into dealing with anthropic?
There is no moat, no special "capability", and when the time comes that we can run these models on our own, they will be cheap SaaS gimmicks marketed to corporate and making more slop pictures for social media.
/s
All the promises of amazing things in general work never happened. Companies consistently say they’re seeing no ROI. The AI crowd now hard pivots to cyber and, right out of the Palantir playbook, runs with the “our stuff is so amazing we can’t talk about it, but trust us bro” move that isn’t really fooling anyone.
Meanwhile the folks let in on the “secret” are those that also desperately need for the hype to continue to protect their own positions in this game.
Look forward to a model upgrade but the hype fluff games are getting old. Watching OpenAI completely crash out of pole position on the hype train though has been at least amusing.
Generally these things only find memory corruption stuff which is almost never the type of bug you're looking for, and it costs a lot which negates your bug bounty payout.
Each time they preach, ooh, 0day found, bla bla.
In this domain you need to be specific or you are just yelling clickbait into the wind.
What type of 0day, what did the exploit actually look like.
'complex 4 stage with heap spray' - that sounds really simple actually.... complex for memory corruption goes into multi-process, maybe things between kernel/usermode, or crazy 18-20 stage exploits people pop against things like MS Teams etc....
Even if there were some cool results from any of these projects, the amount of nonsense blurted out in the articles around them really makes them seem like useless tools that are overmarketed by a bunch of excited children who don't really know what they are doing.
Get a dopamine hit, post on reddit, LOL. Hacking the planet (powered by Claude -_-)
> NSA demands that bug stays in place and gags Anthropic.
> Anthropic releases Mythos.
Then what? Is a huge share of the US zero-day stockpiles about to be disarmed or proliferated?
Expect to see lots of these in the upcoming months as the big companies scramble to keep from losing money.