Mythos Finds a Curl Vulnerability (opens in new tab)

(daniel.haxx.se)

686 pointsTangerineDream4d ago281 comments

281 comments

Quote:

"My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing. I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analyzing."

It's a good reminder for us all that the competition in this space is rough and lots of more or less subtle marketing is involved.

vidarh4d ago

It may well be that the hype was primarily marketing.

The other alternative is that Curl is simply secure enough that there was far less to find than in other projects.

h1fra4d ago

They might be biased by the fact that curl is significantly more secure than the average software

coldtea4d ago

>It's a good reminder for us all that the competition in this space is rough and lots of more or less subtle marketing is involved.

About as subtle as a personal injury lawyer's billboard

steve19774d ago

Better Call Dario

te_chris4d ago

A thankfully American reference

greendude294d ago

I'd go out and say the marketing is not subtle. The hype and fanboys/girls are so in line with the marketing that any level of skepticism is seen a an act of defection, but if you look at the words, hyperbole and volume that is used, there is nothing subtle about it.

It's almost Trump-esque - "this model will change everything forever; we are doomed; we are saved; we will all be fired; we will all be rich", etc

xantronix4d ago

That's a pretty good encapsulation of the parallels between the political and the technological: One necessarily thrives upon the other and are inextricable. This moment is a culmination of all the disenfranchisement the bodypolitik have suffered, looking for any possible means of escape or elevation. AI and Trumpism, for their own respective cohorts, are salvation, on offer by different frontmen but ultimately in service of the same system.

They need the hype to pay off way more than we do. So many of us who still write code directly stand to lose nothing of our capabilities if the marketing claims cannot hold water.

ehnto4d ago

I seem to be totally outside the hype bubble, but I have to suspect there is a lot of imagineering and wild extrapolations in the elss technical hype bubbles. I am curious but no enough to go looking.

apexalpha4d ago

> An amazingly successful marketing stunt for sure.

This. Well done by Antropic.

It even reached the CISO of my small semi-government org in the Netherlands, who slightly panicked at the announced 'tsunami' of vulnerabilities that was coming with Mythos.

Got us some more money and priority with the board, though.

Never waste a good marketing scare.

fpesce3d ago

I don't agree with the "no tsunami in sight": if you don't look at 100+ bugs in Firefox and many more OSS projects, bunch of old unseen-before OpenBSD/Linux RCEs, and a few LPE in just 2 or 3 weeks for Linux itself...

IMO, this does not sound like marketing scare, there is spike of vulnerability disclosures - high quality, low false positives - that can be sensed... It feels like we're speedrunning through few-years worth of high quality bug reports in just a few weeks.

Aurornis3d ago

Mythos isn’t released yet.

Anthropic noticed the trend of AI vulnerability scanning and started advertising Mythos, which is unreleased, as being very good at it.

Then they donated very large token budgets for using Mythos privately to several teams. Those teams used the free token spend for security research (that was the deal) and anything they found got attributed to Mythos, not the token budget.

Mythos looks like a good incremental model but the PR team has done a great job of associating themselves with the current trend. So much so that comments like yours already associated vulnerabilities found with this model which isn’t even available yet

jerf3d ago

Mythos hasn't been released yet, but there seems to be some evidence that GPT-5.5, which has been released, is already a touch better anyhow in some dimensions: https://www.mindstudio.ai/blog/gpt-5-5-vs-claude-mythos-cybe...

Close enough that you can probably get a good sense of Mythos' performance by using GPT-5.5.

One thing I noticed while using GPT-5.5 for this is that the ability of the model to turn the bug into an outright vulnerability is less relevant than you might intuitively think. All that is really necessary is for the model to point out that something is smelly, and you should just fix it. Turning it into a runnable exploit has very limited utility for the defender. It does turn heads and may get the attention of some otherwise reluctant people, but everything I found was obviously enough wrong that the exploit was just decorative.

1 more reply

apexalpha3d ago

The LPEs were not found with Mythos but with existing, publicly available models.

stingraycharles3d ago

And also: they did an earlier run with Opus to discover bugs (like segfaults).

In February, Opus discovered a whole bunch of security related bugs, but didn’t exploit them.

Mythos, in turn, was fed these bugs and told to exploit them.

Not saying it’s not impressive, but it was literally told “here are all the places our metal detector says there may be gold, please find gold”.

1 more reply

yjftsjthsd-h3d ago

> bunch of old unseen-before OpenBSD/Linux RCEs,

AFAIK, the only thing it found in OpenBSD was a DoS?

Edit: For that matter, I'm not aware of RCEs in Linux, only LPE?

fpesce2d ago

The whole thing started with a talk from Nicholas Carlini mentioning a remote 20+ year old NFS vuln IIRC.

helloplanets3d ago

Anthropic has is quickly destroying customer goodwill by repeatedly pulling the same stunt. Horrible marketing, imho.

It's an entirely different thing to have the company conduct research on LLMs in general being a cybersecurity threat, instead of going "our new model is just too powerful" and shift the discussion to revolve around that. It's slimey.

AgentME3d ago

Hasn't almost every new frontier model had an early period of limited access? I don't get why everyone is acting like Mythos is particularly egregious for this.

helloplanets3d ago

This is literally how they announced the model:

> We formed Project Glasswing because of capabilities we’ve observed in a new frontier model trained by Anthropic that we believe could reshape cybersecurity.

> Claude Mythos Preview is a general-purpose, unreleased frontier model that reveals a stark fact: AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities.

https://www.anthropic.com/glasswing

bamboozled3d ago

It is called "Mythos" dude...do you have any idea how mysterious and scary this sounds to most people and how much hype that alone can generate.

If the model was calle "Mini Mouse" it wouldn't feel anywhere near as threatening and interesting.

It sounds like the name of a cologne from the 70s or something and I like it.

aswegs83d ago

The bar has become so low lately that no one will care.

AlexCoventry3d ago

He describes in detail how curl is software-engineered to within an inch of its life. Do you really think most code is that highly polished?

Valakas_3d ago

Well done for convincing people of something that isn't true, in other words, well done for lying? Is this what is being cheered on? Seriously.

markus_zhang3d ago

org head is smart.

EMM_3863d ago

If an AI agent finds zero bugs in a software utility, how can that be viewed in the sense the AI agent is not very good at finding bugs?

What if there are actually zero bugs?

> Five issues felt like nothing as we had expected an extensive list.

The expectation here may not match reality, but not necessarily because Mythos isn't as capable as claimed. curl may just happen to be a well-hardened tool that doesn't have too many security vulnerabilities in its present state.

zamadatix3d ago

The author considered the same w.r.t. remaining bugs:

> More to find

> These were absolutely not the last bugs to find or report. Just while I was writing the drafts for this blog post we have received more reports from security researchers about suspected problems. The AI tools will improve further and the researchers can find new and different ways to prompt the existing AIs to make them find more.

> We have not reached the end of this yet.

> I hope we can keep getting more curl scans done with Mythos and other AIs, over and over until they truly stop finding new problems.

And that makes sense, it'd be quite the argument of coincidence to say there was just 1 proper find remaining & it was only Mythos that managed to find it just at the point in time it released while the other projects have been hoovering up every other find quickly until that point. Possible, but not the safest assumption to start questioning with.

yjftsjthsd-h4d ago

> Not particularly “dangerous”

I'm not sure that follows. As noted, curl was already analyzed to death with every tool available; most software isn't at that level.

anygivnthursday3d ago

But Mythos is not marketed as a tool that can do the same as other tools already available maybe slightly better, but as a revolution.

abirch3d ago

I'm agnostic with Anthropic/Mythos but if there aren't any vulnerabilities there it's hard to find it.

Until we find vulnerabilities in curl that Mythos missed, it's hard to say how good it is.

nicce3d ago

Would have like to see analysis against curl repo where the commit level is one day after the Mythos training data cutoff. And disable access to the internet.

billyoneal2d ago

No. The goal posts are not disproving their marketing. The burden of proof is on the people doing the marketing.

matltc3d ago

Mythos is either dangerous or not. We are taking dangerous to mean that the number of vulns it finds will be much greater than bugs found with available tools.

Since mythos found only one additional vuln, and since x+1 is not much greater than x, it follows that mythos is not dangerous per the definition above.

skybrian3d ago

It doesn’t follow because the results for curl don’t necessarily generalize to other codebases. It’s evidence against Mythos being particularly dangerous, but it’s just one datapoint.

It doesn’t invalidate the other security bugs Mythos allegedly found in other codebases.

bilekas4d ago

I don't think I understand what you mean, the "not particularly dangerous" comment was in relation to the vulnerability that was found right ? Surely they would know what constitutes a lower severity level.

vidarh4d ago

The "not particularly dangerous" is a headline for a section talking about Mythos, not the vulnerability.

bilekas4d ago

Ah okay, that makes a bit more sense. I read it wrong. Then the comment is absolutely fair.

Ekaros4d ago

My guess is that it is in category of "you are holding it wrong". Still worth fixing, but requires very specific user input for example. Or very weird scenario. Or in some less used protocol or flag combination.

croon3d ago

Sure, but isn't it a verdict on Mythos compared to other models?

If so, it would still follow. "Most software" isn't analyzed as much as curl, by either other tooling or other models, that might well find close to the same as Mythos did. As such, Mythos then isn't especially/particularly dangerous.

Sharlin3d ago

Curl is currently receiving a record number of high-quality bug/vuln reports (a rather sharp change from the earlier slop inundation), so it’s not like there’s nothing to find. Many or most of these are presumably found by human experts assisted by AI tools, but if Mythos were truly revolutionary, it should be able to find such issues on its own.

https://daniel.haxx.se/blog/2026/04/22/high-quality-chaos/, linked from TFA

galangalalgol3d ago

Is there a list of infrastructure that has received this kind of focus? Clearly people are looking at the linux kernel, hopefully openssl?

snackerblues2d ago

From article:

> I did a quick unscientific poll on Mastodon to see if other Open Source projects see the same trends and man, do they! Friends from the following projects confirmed that they too see this trend. Of course the exact numbers and volumes vary, but it shows its not unique to any specific project.

> Apache httpd, BIND, curl, Django, Elasticsearch Python client, Firefox, git, glibc, GnuTLS, GStreamer, Haproxy, Immich, libssh, libtiff, Linux kernel, OpenLDAP, PowerDNS, python, Prometheus, Ruby, Sequoia PGP, strongSwan, Temporal, Unbound, urllib3, Vikunja, Wireshark, wolfSSL, …

srcreigh3d ago

I can't help but think that curl is, by nature, a relatively simple and well-contained tool. Compare to an operating system or web browser or database or billion dollar company codebase.

It makes some sense that Mythos/ChatGPT 5.5 might be that much better with complexities that curl just doesn't have because it's a basic tool.

Like yeah curl is obviously extremely fully featured as an "anything client" but it's orders of magnitude less complex than other software we rely on.

sausagefeet3d ago

Curl is a lot more complicated than, I believe, you think. Most people know of it simply as a CLI to hit an HTTP(S) endpoint and write it out. But:

1. It supports basically any file transfer protocol.

2. It is a library that is designed for long running processes.

3. Because it's designed for long running processes, it makes use of every trick it can to pipeline and re-use connections and resources.

4. It has an asynchronous API so it can be integrated into any existing event loop.

Is a web browser or database more complicated? Most certainly, they solve really massive problems. But curl is certainly more complicated than probably most application code that uses it.

joelthelion3d ago

I agree it's rather basic but as stated in the article, its code is still longer than war and peace. There is still plenty of opportunities for security vulnerabilities in something of that size.

breakpointalpha3d ago

From the post:

"curl is currently 176,000 lines of C code when we exclude blank lines. The source code consists of 660,000 words, which is 12% more words than the entire English edition of the novel War and Peace. ... curl is installed in over twenty billion instances. It runs on over 110 operating systems and 28 CPU architectures. It runs in every smart phone, tablet, car, TV, game console and server on earth."

I wouldn't call that simple or well contained...

Most OS or web browsers don't run on cars or tvs.

andriy_koval3d ago

someone(mythos?) should write some simple-curl with 20% of features implemented in rust used by 98% of users.

bilekas4d ago

> The single confirmed vulnerability is going to end up a severity low CVE planned to get published in sync with our pending next curl release 8.21.0 in late June

My mind still cannot understand the quality and refinement that's gone into cURL. It really is the perfect example of something done so right, that people barely think twice about.

pjmlp3d ago

Easy, it shows what is achievable if there is a high bar for quality in every single line of code that gets commited, reviewed and merged, regardless of the programming language.

However in the days of race to bottom, offshoring for penies, and now LLM powered code generation, this is a quality most companies won't care unless there is liability in place.

bilekas3d ago

> Easy, it shows what is achievable if there is a high bar for quality in every single line of code that gets commited

This is becoming a more and more overlooked/underrated feature. I genuinely believe it would be impossible in any company that depends on shareholder value. I am yet to convince any company I've worked in without bloody hands that we need to solve old tech debt and refactor certain things etc.

pjmlp3d ago

Which is liability is relevant, that is the only language shareholders understand.

1 more reply

dotancohen4d ago

Curl and SQLite are my favourite examples of properly engineered, rigourously tested _anything_. It's really philosophical - those projects' contribution requirements demand such rigor, and the maintainers stand by that demand. A non-load-bearing document (not project code) is what makes that possible - very reminiscent of Einstein's thought experiments leading to tangible projects such as GPS or Descartes's belief that all problems can be solved through rational thinking.

ontouchstart3d ago

Some people must be working on training some models exclusively on high quality OSS code base like curl and SQLite without the noise of low quality training data.

I would do that with 100% local models from scratch.

TacticalCoder3d ago

> My mind still cannot understand the quality and refinement that's gone into cURL. It really is the perfect example of something done so right, that people barely think twice about.

And all that to then end with people doing: "curl ... | bash" and not seeing anything wrong about it. Then they'll deflect about "threat models" and other non-sense.

I leave you your curl-bash, I keep my cryptographically signed packages installer.

patrickmeenan3d ago

As far as I can tell, the messaging around Mythos is that it takes the expertise of the top security experts and top-level language, protocol and code experts and makes that available to anyone with access. The danger was in giving that access to the world before the defenders had access to that level of expertise.

Curl HAS had security, protocol and language experts poking at it for years because of how central it is to everything. That Mythos found anything is interesting but not a sign that it's been marketing hype and isn't dangerous.

You can bet that 99.99% of projects aren't nearly as secure as curl and it doesn't matter if they are open or closed source (LLM's will happily decompile closed-source projects and explore). Unless your project has been fuzzed and gone over with existing AI tooling and by experts, expect that it can already be hacked - even with the tooling that is out there now and that something like Mythos makes it accessible for an even wider population pool with less expertise to use.

2001zhaozhao3d ago

Take my upvote. Anthropic never claimed superhuman performance, only speed and scale. That it doesn't find much in terms of new vulnerabilities in a well-studied piece of software says nothing about its overall potential for dangerous misuse.

jrflo3d ago

I know that the Mythos hype is part marketing by anthropic, but isn't it possible that with a highly scrutinized codebase, there just aren't any notable security exploits in it's current state? The fact that it found nothing isn't necessarily an incrimination against it, especially when other tools had identified hundreds of exploits previously. Seems like it's been completely picked over (for now).

1 more reply

AntiUSAbah4d ago

There is always marketing involved and people should be able to put marketing into perspective.

Also curl in this regard is a open source project, relativly small but critical, well known and used everywhere. Besides image libraries, tools like curl or sudo, su, passwd, etc. would also be my first try.

Mythos is still not known at all what it can do. What does it mean from cost and benchmark pov to have a 10 Trillion parameter model?

Nonetheless, the fact that LLMs got significant better in finding this, better than humans, started to happen half a year ago? so at one point we need to address the elefant in the room and state that today you need to do security scanning additional with LLMs. You need to take this serious.

In worst case, use Anthropics marketing to state that its a must now and something changed.

Tade03d ago

> What does it mean from cost and benchmark pov to have a 10 Trillion parameter model?

To me it means that we've hit the top end of the S-curve with regards to effects of scaling - if the tool isn't remarkably better despite the scale, then we're firmly in diminishing returns territory.

u_fucking_dork3d ago

> Mythos is still not known at all what it can do.

And this is very much on purpose my friend. Think about what people already believe it can do though.

flohofwoe3d ago

> Nonetheless, the fact that LLMs got significant better in finding this, better than humans, started to happen half a year ago?

*rolls eyes* regular static analyzers also have been "better than humans" for decades, being better than a human at a specific mechanical task really doesn't mean much. The interesting new thing is the type of potential "fuzzy bugs" described in the article that LLMs are able to identify (a comment not matching the code it describes, uncommon usage of a 3rd party library, mismatch of code and a protocol it implements, or often just generally weird looking code somebody should have a closer look at... this closes a gap in the traditional debugging toolboxes, but shouldn't replace them)

AntiUSAbah3d ago

You don't have to dismantle a comment on a microlevel.

It has been clear for ages that certain type of bugs or issues are better solved from software.

But there was still plenty of things a proper SecOps Person would be able to find with help from tooling which automatic tooling wouldn't find.

Taking a limited amount of resources and focusing on the critical things.

I do think this is gone now. Same with Threat modeling etc.

pixl973d ago

Static analyzers are balls. For every real bug they find you are dealing with with piles of false positives and negatives.

Now, I'm not saying you shouldn't use them. They do catch the low hanging fruit. It's that LLMs actually have a much better understanding of things like intent when looking at your code and general architecture configurations that can lead to problems.

As you say we've had static analyzers forever, hence why they aren't dropping out 50 new CVE's a day. LLMs are. There is a massive stack of software out there that is getting analyzed and exploited at a rate faster than it's getting patched. Adding to that things like NPMs exploited package of the day and popular github repository takeovers this year looks massively different from last year in quantity and quality of exploits alone.

flohofwoe3d ago

IME LLMs generate at least as much false positives as static analyzers, but they're good at catching entirely different types of problems than static analyzers. 99% of false positives are avoided with a proper assert hygiene, and from what I've seen that seems to be true both for traditional static analyzers and llms, those assert annotate the code with valuable hints that may go beyond a specific type system's capabilities.

ahofmann4d ago

Putting on my tinfoil-hat: Sooo, the guy who runs the test and delivers the report could just have removed the more interesting bugs and delivered those to any three letter agency?

bilekas4d ago

No, based on cURL's history, it really seems like they would love to have found a really novel bug. Now if it was a for profit company.. Tinfoil hat would be shared!

AnssiH4d ago

The test was run by an unnamed third party, so cURL's history has no relevance to their benevolence.

Ekaros4d ago

Curl is likely one of the very much more combed over pieces of code at this point. It feels like it has some special draw for people looking for vulnerabilities. Not that it doesn't mean some novel idea can't be looked or checked still.

cakealert4d ago

> No, based on cURL's history, it really seems like they would love to have found a really novel bug.

You just confirmed that you didn't read the article.

"Eventually, I was instead offered that someone else, who has access to the model, could run a scan and analysis on curl for me using Mythos and send me a report."

bilekas4d ago

I'm not sure how that proves I didn't read the article ?

NitpickLawyer3d ago

What's going on in this thread? It's weird how prevalent the negativity towards mythos is, and I'm not sure if it's people throwing the baby out with the bathwater or something more tinfoil-adjacent coordinated campaign. I also noticed this on a thread a few days ago, before the mozilla post. There were dozens of comments saying basically "mythos is vaporware".

I get the idea that they're using it for marketing. Of course they are. But to reduce it at "just marketing" feels either ill informed or outright wrong. Unless you have reasons to not believe the dozens of credentialed, well respected people in the field that have already shared their opinions after working with mythos. Plenty of them on all the social media sites.

And then there's the team at mozilla. They wrote a blog about this, and they've worked with anthropic before, using opus 4.6 and found and fixed 22 vulnerabilities. Then they worked with mythos and found and fixed 271 vulnerabilities. Unless you're going to accuse them of being shills, these are unquestionable numbers. The model is quantitatively better at this thing. And it matches what everyone is saying.

I think there are better things to accuse anthropic of, than that they are simply lying for marketing purposes. Of course they'll use this as a marketing campaign, but there's plenty of evidence out there that there is something there, that the model is simply better than previous generations at this. Don't fall for the cheap reductionist stuff, just because you don't like them, or feel that this is marketing fluff. It doesn't feel like a gimmick, even if it gets used to push their agenda. Something, something, propaganda often uses true statements as well.

countWSS3d ago

Here and on reddit, AI debugging is viewed as some weird shallow pattern-matching that obviously fails to spot real stuff and overload the maintainers. Instead of getting to "spotless record" of zero flaws, the people start rationalizing that "X is not a real bug" and inventing justifications for their(obviously bad) code, which is critique they can't accept from AI, only through human debate that they can't close with a WONTFIX. Once the bug is actually usable, the tune changes completely.

mschuster913d ago

> Here and on reddit, AI debugging is viewed as some weird shallow pattern-matching that obviously fails to spot real stuff and overload the maintainers.

That's because that is what a lot of people did in the last years [1] to pad their resumes or to force developers to backport patches to older (but supported) kernel versions that wouldn't have gone in if they didn't have a CVE attached [2]. Maintainers have been legitimately swamped with low-quality spam for a very long time. Only recently, in the last few months, AI actually got "good enough", the problem is that maintainers still have to differentiate between AI slop by wannabes and by AI-assisted reports reviewed and refined by actual human professionals.

[1] https://www.zdnet.com/article/how-fake-security-reports-are-...

[2] https://opensourcewatch.beehiiv.com/p/linux-gets-cve-securit...

pixl973d ago

At the end of the day attackers don't give a fuck. "Waaa waaa, AI was bad 6 months ago so I'm going to throw a little fit" doesn't work when it's currently actively exploiting your shit. No one gives a damn if there are 4000 bullshit security PRs lined up. The one real RCE in there mean that everything you hold dear has already been carted off by nation states, and probably rediscovered by 3 or 4 other exploitation groups by this point.

It's time for all the little snowflake software writers to pull up their pantaloons and realize that Linus' vision has become real. With enough AIs all security bugs become shallow. And that software affects the real word, real money, and real people in it. That they are also under attack by well financed groups with rather evil motivations. If I'm attacking some group using your software (such as another nation) I'm going to flood the fuck out of your PR system till you give up hope and die. I'm going to make you attack your contributors. I'm going to sow confusion so I have the maximum amount of time to lay waste to my enemies and profit to the max.

The internet is hostile. Software is hostile. There are sharks looking to eat you.

Time to face that fact.

u_fucking_dork3d ago

> And then there's the team at mozilla

And then there’s the team at curl. Don’t fall for the cheap marketing stuff just because you like them

Everything points to Mythos being marginally better and nobody being able to afford to run it.

addedGone3d ago

Many can afford it, many companies are easily spending 15-30K a month in tokens per staff.

ZrArm3d ago

> Unless you have reasons to not believe the dozens of credentialed, well respected people in the field that have already shared their opinions after working with mythos.

Exactly the same argument was made about o3-preview, lol. But anyway, do they talk about all domains where Mythos did the leap in capabilities (math and other research, ML, SWE) or only about cybersec?

> And then there's the team at mozilla. They wrote a blog about this, and they've worked with anthropic before, using opus 4.6 and found and fixed 22 vulnerabilities. Then they worked with mythos and found and fixed 271 vulnerabilities

Those 22 bugs were found in February, at the time when Mozilla were doing first small-scale experiments with Opus 4.6 (i.e. no proper integration into workflow, likely relatively simple harness, likely only small part of codebase was covered). You can't compare "22 bugs which were found during very early attempts to apply AI" and "271 bugs which were found during large-scale codebase scanning with properly configured AI". The fact that Mozilla is pretty vague about "contribution of other AI models" makes it even worse.

> Unless you're going to accuse them of being shills, these are unquestionable numbers. The model is quantitatively better at this thing

They found another ~150 bugs after their first announce, and only like ~35 were found by Mythos. It's already very sharp drop in contribution.

> I think there are better things to accuse anthropic of, than that they are simply lying for marketing purposes.

Anthropic already used a lot of "technically correct but in fact deceiving" statements in Mythos system card. They are playing both "It's too dangerous" and "We don't have enough compute for that super model" at the moment (it's usually a big red falg). Opus 4.7 (which was likely supposed to be "Opus 5.0", given various facts) is a disaster from various points of views. Of course people don't really believe Anthropic.

doctorpangloss3d ago

HN readers really like the idea of Daniel, he's sticking it to the man... by doing free publicity and work for the man haha

hmokiguess3d ago

> curl is one of the most fuzzed and audited C codebases in existence (OSS-Fuzz, Coverity, CodeQL, multiple paid audits). Finding anything in the hot paths (HTTP/1, TLS, URL parsing core) is unlikely.

The way this reads sounds more like the LLM dismissed trying rather than it tried and failed, I've seen Claude do that often unless I probe it to challenge itself, curious here what actually happened.

nevi-me3d ago

> These tools and the analyses they have done have triggered somewhere between two and three hundred bugfixes merged in curl through-out the recent 8-10 months or so.

If you've just gone through a lengthy analysis of your code with other AI tools, surely it's reasonable not to expect to see hundreds more from a new tool?

It should be possible, unless more bugs are introduced, to eventually get to a state where there are no more bugs in your code.

Process aside, it sounds like Daniel expected to find dozens/hundreds more bugs.

jaapz3d ago

Mythos was kind of hyped as the tool that would discover much more bugs than any currently available tool

pbmonster3d ago

curl had ~15 CVEs in 2026 so far. You surely don't think those (and the one Mythos found) were the last security bugs still left in the code base? There certainly will be more, in fact Daniel predicts ~50 CVEs for the entire year.

But Mythos found 1. After all that hype. 1.

knowaveragejoe3d ago

Maybe curl is just... better hardened? Firefox posted hundreds in April.

tgtweak3d ago

I feel like, if it was a codebase without using any security analysis tools, there would have been some more significant findings - perhaps they can re-run it on an 18 month old commit and see how many it found that were subsequenty found and fixed?

Anyway, I think the case that frontier and next-gen models will get increasingly adept at finding vulnerabilities and that those on the receiving end of those vulnerabilities need to be on top of it.

ostif-derek2d ago

Unfortunately that doesn't help much. LLMs are really really good at digging up known vulns, so much so that they often falsely declare known vulns as new and novel ones.

They have the CVEs in their training data, know how to look up ossfuzz logs, etc.

andromaton3d ago

If priced like other Anthropic models, Mythos will make vulnerability discovery a lot more accessible.

The author compares it to AISLE, ZeroPath, and OpenAI’s Codex Security. AISLE and ZeroPath are much more expensive. OpenAI’s Codex Security is gated.

Most people don't care about the first two and don't complain about the latter's policy because they are all specialized models and/or harnesses.

Mythos will be available to all.

vibedev9993d ago

> AISLE and ZeroPath are much more expensiv

AISLE is *cheaper* for sure

mohsen14d ago

I don't know about Mythos but in recent weeks I've noticed Opus is constantly failing to fix things in tsz[0] vs GPT 5.5 can easily churn out fixes that are solid and pass tests. I've stopped paying for Claude for now and all my money is going to OpenAI at the moment. Either Opus is massively nerfed or GPT 5.5 is really head and shoulder higher in terms of very difficult tasks. The last percent of conformance tests in tsz are really really difficult and I've seen Opus bailing again and again. So annoying to waste time and tokens to finally get "this is too involved" or "this requires a multi-week sprint to fix".

[0] https://tsz.dev

_pdp_4d ago

The new Opus feels like a step backwards. More expensive, thinks more, and it does not get the job done.

vincent_s3d ago

From a user’s perspective 4.7 is a downgrade compared to 4.6 . It’s intended to give Anthropic more control about their compute resources and profitability:

https://news.ycombinator.com/item?id=48072916

dyauspitr4d ago

Having never used Claude and only Codex, does Claude actually say “this is too involved” as a response to a prompt?

mohsen14d ago

Yes it does. Usually after hours of working and not getting results

redditor986543d ago

I am curious, what kind of work do you use Claude for that sometimes requires hours of working. In my case, I have never seen it go off for more than 10 mins and even that is very rare.

jorisw3d ago

Love Daniel's writing style here. Fact based, concise, easy to read

ilia-a3d ago

IMHO Mythos was more of a marketing ploy.

When it comes to security and AI, all top tier publicly accessible models (GPT 5.5, Opus 4.7) and even near-top like Deepseek 4 PRO can do a very good job given detailed harness on how to spot issues and cross-validate them to avoid false positives.

romaniv3d ago

"I signed the contract for getting access, but then nothing happened. Weeks went past and I was told there was a hiccup somewhere and access was delayed.
Eventually, I was instead offered that someone else, who has access to the model, could run a scan and analysis on curl for me using Mythos and send me a report. To me, the distinction isn’t that important."

Really? We're talking about (essentially) a product demo from a trillion dollar industry fueled by debt. Clearly, blog posts like this have an immense influence on the perception of usefulness of the particular model and AI in general. With so much staked on this for the company, wouldn't you want to be sure that you're using the actual product without anyone messing with the results in any way?

Semkas3d ago

I'm disinclined to be overly generous to Antrophic, but I have to say that regardless of whether the talk of Mythos being uniquely dangerous was mostly cynical: It would be great if this starts a trend of giving security-critical software a few months head start with any new significantly improved model.

tuananh3d ago

Interesting. curl team found that Mythos is mostly hype while Calif team found Mythos amazing.

I would think Calif (a security firm) is a better team to better utilize such tool.

vb-84483d ago

I guess we miss fundamental information: how much in terms of time and token usage took to the "middle guy" to create the report?

Next question: could it be that OP can use Mythos in a better way since he knows better the project?

absynth4d ago

I routinely used to compile C programs on other compilers to find defects that one or another didn't find. Compiling on Windows vs Linux. You could summarize / minimize it down to compiling it with warning as errors etc but you'd be missing the point.

The point wasn't actual cross-platform portability even though that was a nice side effect. It was to flush out all the weird edge cases.

Edges like security flaws. Buffer overflows are usually platform specific. There are plenty of other ways to find these issues but simply recompiling for a different platform surfaces all sorts of issues.

tedd4u3d ago

It's also a convenient way to get press (and investor valuation) for a new model with releasing it (word is they don't have enough hardware to do so).

23aqsI2d ago

Like clockwork, criticism of the Alpha Omega apparatchiks is flagged. They know how to protect their income streams while open source authors get nothing.

yjftsjthsd-h4d ago

> The source code consists of 660,000 words, which is 12% more words than the entire English edition of the novel War and Piece.

Typo, or is there a spoof I should go read?

dotancohen4d ago

Perhaps he was dictating.

Does it say anything else? Just 'Aaaarggghhhh'?

Hamuko4d ago

Doubt it considering that Daniel Stenberg is Swedish. English dictation when you speak English as a second language with an accent is quite annoying.

Tistron4d ago

Voice input works really well for people speaking English with a Swedish accent. I think the accent of most educated Swedes is mostly a case of prosody. For sure there are some sounds we say slightly differently than native English speakers. We often have some trouble with /s/ and /z/, but I don't know, "war and peace", I think that's easily understood.

Source: voice typing this with Swedish vocal chords, and only had to correct "different lives" to "differently", and add /[^\w\s]/.

iso16314d ago

War and Peace is about 590,000 words. Tiny compared to the full Harry Potter collection (about 1 million words over the 7 books), but long for a single book.

perching_aix4d ago

They're referring to the typo in the title, "Piece" vs "Peace".

I also thought they were contending the word count before noticing. Even remarked how I find this a weird metric, given that code is not prose [0], but then I deleted that once I picked up on what's going on.

[0] comparing the output of `wc -w` with the word counts of books I'm reasonably sure will be super off

plexescor3d ago

I personally belive its a marketing stunt and they are just using actual humans to find the bugs/vulns

theaniketmaurya3d ago

Who is using Mythos to find these things and where do they run it?

jedisct13d ago

Swival found many more vulnerabilities without Mythos https://github.com/swival/security-audits

nottorp3d ago

> (I am purposely leaving out the identity of the individual(s) involved in getting the curl analysis done as it is not the point of this blog post.)

I would very much like to know if they were independent or affiliated to Anthropic.

> My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing.

... because of this.

brunoborges3d ago

AI not finding a security issue on cURL has more to do with lack of widespread security issues than the model's capacity of finding them.

AtNightWeCode3d ago

Should have scanned it with Mythos on an older code base before all these other sec issues was resolved with other tools. Or use the other tools to introduce the same kind of errors in other parts of the code base to see if Mythos would have found it.

A problem is that these tools seems smarter than they are cause they already read seen the answer key.

1 more reply

readthenotes13d ago

Kinda burying the lede: AI tools found over a dozen CVEs in curl last year, and hundreds of bugs.

"Primarily AISLE, Zeropath and OpenAI’s Codex Security have been used to scrutinize the code with AI. These tools and the analyses they have done have triggered somewhere between two and three hundred bugfixes merged in curl through-out the recent 8-10 months or so. A bunch of the findings these AI tools reported were confirmed vulnerabilities and have been published as CVEs. Probably a dozen or more."

toraway3d ago

Not exactly "burying the lede" since Daniel already posted an update about it months ago [1] with extensive discussion in numerous of articles [2] including on this site [3].

[1] https://lists.haxx.se/pipermail/daniel/2025-September/000127...

[2] https://www.theregister.com/software/2025/10/02/curl-project...

[3] https://news.ycombinator.com/item?id=45449348

jongjong3d ago

I'm looking forward to trying Mythos run against my 5000-line, instant-finality, quantum-resistant blockchain project and decentralized exchange (an additional 5000 lines). I already ran all the models up to Opus 4.6 and they couldn't find anything.

utopiah3d ago

Won my bet "voted 10 [vulnerabilities] but in retrospect as you are familiar with Claude and such tooling if you already used any of recent model to done some kind of security review then I'd drop to 1 or even 0." https://mastodon.pirateparty.be/@utopiah/116537456780283420

perching_aix4d ago

It's a shame he seems to reject the idea of actually diving in and using these tools interactively:

> It’s not that I would have a lot of time to explore lots of different prompts and doing deep dive adventures anyway.

His expertise I think would elevate the results quite a bit. Although if he never uses LLMs, which it reads like he doesn't, I guess it might backfire just as well. Prompting style (still?) does matter after all, certainly in my experience anyways.

jph003d ago

He states in the article that they use LLMs for this purpose and find them extremely useful.

j / k navigate · click thread line to collapse

281 comments

rzmmm4d ago

Quote:

It's a good reminder for us all that the competition in this space is rough and lots of more or less subtle marketing is involved.

vidarh4d ago

It may well be that the hype was primarily marketing.

The other alternative is that Curl is simply secure enough that there was far less to find than in other projects.

h1fra4d ago

They might be biased by the fact that curl is significantly more secure than the average software

coldtea4d ago

>It's a good reminder for us all that the competition in this space is rough and lots of more or less subtle marketing is involved.

About as subtle as a personal injury lawyer's billboard

steve19774d ago

Better Call Dario

te_chris4d ago

A thankfully American reference

greendude294d ago

It's almost Trump-esque - "this model will change everything forever; we are doomed; we are saved; we will all be fired; we will all be rich", etc

xantronix4d ago

They need the hype to pay off way more than we do. So many of us who still write code directly stand to lose nothing of our capabilities if the marketing claims cannot hold water.

ehnto4d ago

apexalpha4d ago

> An amazingly successful marketing stunt for sure.

This. Well done by Antropic.

It even reached the CISO of my small semi-government org in the Netherlands, who slightly panicked at the announced 'tsunami' of vulnerabilities that was coming with Mythos.

Got us some more money and priority with the board, though.

Never waste a good marketing scare.

fpesce3d ago

Aurornis3d ago

Mythos isn’t released yet.

Anthropic noticed the trend of AI vulnerability scanning and started advertising Mythos, which is unreleased, as being very good at it.

jerf3d ago

Close enough that you can probably get a good sense of Mythos' performance by using GPT-5.5.

1 more reply

apexalpha3d ago

The LPEs were not found with Mythos but with existing, publicly available models.

stingraycharles3d ago

And also: they did an earlier run with Opus to discover bugs (like segfaults).

In February, Opus discovered a whole bunch of security related bugs, but didn’t exploit them.

Mythos, in turn, was fed these bugs and told to exploit them.

Not saying it’s not impressive, but it was literally told “here are all the places our metal detector says there may be gold, please find gold”.

1 more reply

yjftsjthsd-h3d ago

> bunch of old unseen-before OpenBSD/Linux RCEs,

AFAIK, the only thing it found in OpenBSD was a DoS?

Edit: For that matter, I'm not aware of RCEs in Linux, only LPE?

fpesce2d ago

The whole thing started with a talk from Nicholas Carlini mentioning a remote 20+ year old NFS vuln IIRC.

helloplanets3d ago

Anthropic has is quickly destroying customer goodwill by repeatedly pulling the same stunt. Horrible marketing, imho.

AgentME3d ago

Hasn't almost every new frontier model had an early period of limited access? I don't get why everyone is acting like Mythos is particularly egregious for this.

helloplanets3d ago

This is literally how they announced the model:

> We formed Project Glasswing because of capabilities we’ve observed in a new frontier model trained by Anthropic that we believe could reshape cybersecurity.

https://www.anthropic.com/glasswing

bamboozled3d ago

It is called "Mythos" dude...do you have any idea how mysterious and scary this sounds to most people and how much hype that alone can generate.

If the model was calle "Mini Mouse" it wouldn't feel anywhere near as threatening and interesting.

It sounds like the name of a cologne from the 70s or something and I like it.

aswegs83d ago

The bar has become so low lately that no one will care.

AlexCoventry3d ago

He describes in detail how curl is software-engineered to within an inch of its life. Do you really think most code is that highly polished?

Valakas_3d ago

Well done for convincing people of something that isn't true, in other words, well done for lying? Is this what is being cheered on? Seriously.

markus_zhang3d ago

org head is smart.

EMM_3863d ago

If an AI agent finds zero bugs in a software utility, how can that be viewed in the sense the AI agent is not very good at finding bugs?

What if there are actually zero bugs?

> Five issues felt like nothing as we had expected an extensive list.

zamadatix3d ago

The author considered the same w.r.t. remaining bugs:

> More to find

> We have not reached the end of this yet.

> I hope we can keep getting more curl scans done with Mythos and other AIs, over and over until they truly stop finding new problems.

yjftsjthsd-h4d ago

> Not particularly “dangerous”

I'm not sure that follows. As noted, curl was already analyzed to death with every tool available; most software isn't at that level.

anygivnthursday3d ago

But Mythos is not marketed as a tool that can do the same as other tools already available maybe slightly better, but as a revolution.

abirch3d ago

I'm agnostic with Anthropic/Mythos but if there aren't any vulnerabilities there it's hard to find it.

Until we find vulnerabilities in curl that Mythos missed, it's hard to say how good it is.

nicce3d ago

Would have like to see analysis against curl repo where the commit level is one day after the Mythos training data cutoff. And disable access to the internet.

billyoneal2d ago

No. The goal posts are not disproving their marketing. The burden of proof is on the people doing the marketing.

matltc3d ago

Mythos is either dangerous or not. We are taking dangerous to mean that the number of vulns it finds will be much greater than bugs found with available tools.

Since mythos found only one additional vuln, and since x+1 is not much greater than x, it follows that mythos is not dangerous per the definition above.

skybrian3d ago

It doesn’t follow because the results for curl don’t necessarily generalize to other codebases. It’s evidence against Mythos being particularly dangerous, but it’s just one datapoint.

It doesn’t invalidate the other security bugs Mythos allegedly found in other codebases.

bilekas4d ago

vidarh4d ago

The "not particularly dangerous" is a headline for a section talking about Mythos, not the vulnerability.

bilekas4d ago

Ah okay, that makes a bit more sense. I read it wrong. Then the comment is absolutely fair.

Ekaros4d ago

croon3d ago

Sure, but isn't it a verdict on Mythos compared to other models?

Sharlin3d ago

https://daniel.haxx.se/blog/2026/04/22/high-quality-chaos/, linked from TFA

galangalalgol3d ago

Is there a list of infrastructure that has received this kind of focus? Clearly people are looking at the linux kernel, hopefully openssl?

snackerblues2d ago

From article:

srcreigh3d ago

I can't help but think that curl is, by nature, a relatively simple and well-contained tool. Compare to an operating system or web browser or database or billion dollar company codebase.

It makes some sense that Mythos/ChatGPT 5.5 might be that much better with complexities that curl just doesn't have because it's a basic tool.

Like yeah curl is obviously extremely fully featured as an "anything client" but it's orders of magnitude less complex than other software we rely on.

sausagefeet3d ago

Curl is a lot more complicated than, I believe, you think. Most people know of it simply as a CLI to hit an HTTP(S) endpoint and write it out. But:

1. It supports basically any file transfer protocol.

2. It is a library that is designed for long running processes.

3. Because it's designed for long running processes, it makes use of every trick it can to pipeline and re-use connections and resources.

4. It has an asynchronous API so it can be integrated into any existing event loop.

Is a web browser or database more complicated? Most certainly, they solve really massive problems. But curl is certainly more complicated than probably most application code that uses it.

joelthelion3d ago

I agree it's rather basic but as stated in the article, its code is still longer than war and peace. There is still plenty of opportunities for security vulnerabilities in something of that size.

breakpointalpha3d ago

From the post:

I wouldn't call that simple or well contained...

Most OS or web browsers don't run on cars or tvs.

andriy_koval3d ago

someone(mythos?) should write some simple-curl with 20% of features implemented in rust used by 98% of users.

bilekas4d ago

> The single confirmed vulnerability is going to end up a severity low CVE planned to get published in sync with our pending next curl release 8.21.0 in late June

My mind still cannot understand the quality and refinement that's gone into cURL. It really is the perfect example of something done so right, that people barely think twice about.

pjmlp3d ago

Easy, it shows what is achievable if there is a high bar for quality in every single line of code that gets commited, reviewed and merged, regardless of the programming language.

However in the days of race to bottom, offshoring for penies, and now LLM powered code generation, this is a quality most companies won't care unless there is liability in place.

bilekas3d ago

> Easy, it shows what is achievable if there is a high bar for quality in every single line of code that gets commited

pjmlp3d ago

Which is liability is relevant, that is the only language shareholders understand.

1 more reply

dotancohen4d ago

ontouchstart3d ago

Some people must be working on training some models exclusively on high quality OSS code base like curl and SQLite without the noise of low quality training data.

I would do that with 100% local models from scratch.

TacticalCoder3d ago

> My mind still cannot understand the quality and refinement that's gone into cURL. It really is the perfect example of something done so right, that people barely think twice about.

And all that to then end with people doing: "curl ... | bash" and not seeing anything wrong about it. Then they'll deflect about "threat models" and other non-sense.

I leave you your curl-bash, I keep my cryptographically signed packages installer.

patrickmeenan3d ago

2001zhaozhao3d ago

jrflo3d ago

1 more reply

AntiUSAbah4d ago

There is always marketing involved and people should be able to put marketing into perspective.

Mythos is still not known at all what it can do. What does it mean from cost and benchmark pov to have a 10 Trillion parameter model?

In worst case, use Anthropics marketing to state that its a must now and something changed.

Tade03d ago

> What does it mean from cost and benchmark pov to have a 10 Trillion parameter model?

u_fucking_dork3d ago

> Mythos is still not known at all what it can do.

And this is very much on purpose my friend. Think about what people already believe it can do though.

flohofwoe3d ago

> Nonetheless, the fact that LLMs got significant better in finding this, better than humans, started to happen half a year ago?

AntiUSAbah3d ago

You don't have to dismantle a comment on a microlevel.

It has been clear for ages that certain type of bugs or issues are better solved from software.

But there was still plenty of things a proper SecOps Person would be able to find with help from tooling which automatic tooling wouldn't find.

Taking a limited amount of resources and focusing on the critical things.

I do think this is gone now. Same with Threat modeling etc.

pixl973d ago

Static analyzers are balls. For every real bug they find you are dealing with with piles of false positives and negatives.

flohofwoe3d ago

ahofmann4d ago

Putting on my tinfoil-hat: Sooo, the guy who runs the test and delivers the report could just have removed the more interesting bugs and delivered those to any three letter agency?

bilekas4d ago

No, based on cURL's history, it really seems like they would love to have found a really novel bug. Now if it was a for profit company.. Tinfoil hat would be shared!

AnssiH4d ago

The test was run by an unnamed third party, so cURL's history has no relevance to their benevolence.

Ekaros4d ago

cakealert4d ago

> No, based on cURL's history, it really seems like they would love to have found a really novel bug.

You just confirmed that you didn't read the article.

"Eventually, I was instead offered that someone else, who has access to the model, could run a scan and analysis on curl for me using Mythos and send me a report."

bilekas4d ago

I'm not sure how that proves I didn't read the article ?

NitpickLawyer3d ago

countWSS3d ago

mschuster913d ago

> Here and on reddit, AI debugging is viewed as some weird shallow pattern-matching that obviously fails to spot real stuff and overload the maintainers.

[1] https://www.zdnet.com/article/how-fake-security-reports-are-...

[2] https://opensourcewatch.beehiiv.com/p/linux-gets-cve-securit...

pixl973d ago

The internet is hostile. Software is hostile. There are sharks looking to eat you.

Time to face that fact.

u_fucking_dork3d ago

> And then there's the team at mozilla

And then there’s the team at curl. Don’t fall for the cheap marketing stuff just because you like them

Everything points to Mythos being marginally better and nobody being able to afford to run it.

addedGone3d ago

Many can afford it, many companies are easily spending 15-30K a month in tokens per staff.

ZrArm3d ago

> Unless you have reasons to not believe the dozens of credentialed, well respected people in the field that have already shared their opinions after working with mythos.

> Unless you're going to accuse them of being shills, these are unquestionable numbers. The model is quantitatively better at this thing

They found another ~150 bugs after their first announce, and only like ~35 were found by Mythos. It's already very sharp drop in contribution.

> I think there are better things to accuse anthropic of, than that they are simply lying for marketing purposes.

doctorpangloss3d ago

HN readers really like the idea of Daniel, he's sticking it to the man... by doing free publicity and work for the man haha

hmokiguess3d ago

nevi-me3d ago

> These tools and the analyses they have done have triggered somewhere between two and three hundred bugfixes merged in curl through-out the recent 8-10 months or so.

If you've just gone through a lengthy analysis of your code with other AI tools, surely it's reasonable not to expect to see hundreds more from a new tool?

It should be possible, unless more bugs are introduced, to eventually get to a state where there are no more bugs in your code.

Process aside, it sounds like Daniel expected to find dozens/hundreds more bugs.

jaapz3d ago

Mythos was kind of hyped as the tool that would discover much more bugs than any currently available tool

pbmonster3d ago

But Mythos found 1. After all that hype. 1.

knowaveragejoe3d ago

Maybe curl is just... better hardened? Firefox posted hundreds in April.

tgtweak3d ago

Anyway, I think the case that frontier and next-gen models will get increasingly adept at finding vulnerabilities and that those on the receiving end of those vulnerabilities need to be on top of it.

ostif-derek2d ago

Unfortunately that doesn't help much. LLMs are really really good at digging up known vulns, so much so that they often falsely declare known vulns as new and novel ones.

They have the CVEs in their training data, know how to look up ossfuzz logs, etc.

andromaton3d ago

If priced like other Anthropic models, Mythos will make vulnerability discovery a lot more accessible.

The author compares it to AISLE, ZeroPath, and OpenAI’s Codex Security. AISLE and ZeroPath are much more expensive. OpenAI’s Codex Security is gated.

Most people don't care about the first two and don't complain about the latter's policy because they are all specialized models and/or harnesses.

Mythos will be available to all.

vibedev9993d ago

> AISLE and ZeroPath are much more expensiv

AISLE is *cheaper* for sure

mohsen14d ago

[0] https://tsz.dev

_pdp_4d ago

The new Opus feels like a step backwards. More expensive, thinks more, and it does not get the job done.

vincent_s3d ago

From a user’s perspective 4.7 is a downgrade compared to 4.6 . It’s intended to give Anthropic more control about their compute resources and profitability:

https://news.ycombinator.com/item?id=48072916

dyauspitr4d ago

Having never used Claude and only Codex, does Claude actually say “this is too involved” as a response to a prompt?

mohsen14d ago

Yes it does. Usually after hours of working and not getting results

redditor986543d ago

I am curious, what kind of work do you use Claude for that sometimes requires hours of working. In my case, I have never seen it go off for more than 10 mins and even that is very rare.

jorisw3d ago

Love Daniel's writing style here. Fact based, concise, easy to read

ilia-a3d ago

IMHO Mythos was more of a marketing ploy.

romaniv3d ago

Semkas3d ago

tuananh3d ago

Interesting. curl team found that Mythos is mostly hype while Calif team found Mythos amazing.

I would think Calif (a security firm) is a better team to better utilize such tool.

vb-84483d ago

I guess we miss fundamental information: how much in terms of time and token usage took to the "middle guy" to create the report?

Next question: could it be that OP can use Mythos in a better way since he knows better the project?

absynth4d ago

The point wasn't actual cross-platform portability even though that was a nice side effect. It was to flush out all the weird edge cases.

tedd4u3d ago

It's also a convenient way to get press (and investor valuation) for a new model with releasing it (word is they don't have enough hardware to do so).

23aqsI2d ago

Like clockwork, criticism of the Alpha Omega apparatchiks is flagged. They know how to protect their income streams while open source authors get nothing.

yjftsjthsd-h4d ago

> The source code consists of 660,000 words, which is 12% more words than the entire English edition of the novel War and Piece.

Typo, or is there a spoof I should go read?

dotancohen4d ago

Perhaps he was dictating.

Does it say anything else? Just 'Aaaarggghhhh'?

Hamuko4d ago

Doubt it considering that Daniel Stenberg is Swedish. English dictation when you speak English as a second language with an accent is quite annoying.

Tistron4d ago

Source: voice typing this with Swedish vocal chords, and only had to correct "different lives" to "differently", and add /[^\w\s]/.

iso16314d ago

War and Peace is about 590,000 words. Tiny compared to the full Harry Potter collection (about 1 million words over the 7 books), but long for a single book.

perching_aix4d ago

They're referring to the typo in the title, "Piece" vs "Peace".

[0] comparing the output of `wc -w` with the word counts of books I'm reasonably sure will be super off

plexescor3d ago

I personally belive its a marketing stunt and they are just using actual humans to find the bugs/vulns

theaniketmaurya3d ago

Who is using Mythos to find these things and where do they run it?

jedisct13d ago

Swival found many more vulnerabilities without Mythos https://github.com/swival/security-audits

nottorp3d ago

> (I am purposely leaving out the identity of the individual(s) involved in getting the curl analysis done as it is not the point of this blog post.)

I would very much like to know if they were independent or affiliated to Anthropic.

> My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing.

... because of this.

brunoborges3d ago

AI not finding a security issue on cURL has more to do with lack of widespread security issues than the model's capacity of finding them.

AtNightWeCode3d ago

A problem is that these tools seems smarter than they are cause they already read seen the answer key.

1 more reply

readthenotes13d ago

Kinda burying the lede: AI tools found over a dozen CVEs in curl last year, and hundreds of bugs.

toraway3d ago

Not exactly "burying the lede" since Daniel already posted an update about it months ago [1] with extensive discussion in numerous of articles [2] including on this site [3].

[1] https://lists.haxx.se/pipermail/daniel/2025-September/000127...

[2] https://www.theregister.com/software/2025/10/02/curl-project...

[3] https://news.ycombinator.com/item?id=45449348

jongjong3d ago

utopiah3d ago

perching_aix4d ago

It's a shame he seems to reject the idea of actually diving in and using these tools interactively:

> It’s not that I would have a lot of time to explore lots of different prompts and doing deep dive adventures anyway.

jph003d ago

He states in the article that they use LLMs for this purpose and find them extremely useful.

j / k navigate · click thread line to collapse