To be fair, the limiting factor in remediation is usually finding a reproducible test case, which a vulnerability is by necessity. But I would still bet most systems have plenty of bugs in their bug trackers that come with a reproducible test case and are still bottlenecked on remediation resources.
This is of course orthogonal to the fact that patching systems that are insecure by design into security has so far been a colossal failure.
If you find a bug in a web browser, that's no big deal. I encounter bugs in web browsers all the time.
You figure out how to make a web page that when viewed deletes all the files on the user's hard drive? That's a little different and not something that people discover very often.
Sure, you'll still probably have a long queue of ReDoS bugs, but the only people who think those are security issues are people who enjoy the ego boost of having a CVE in their name.
The prediction is: Within the next few months, coding agents will drastically alter both the practice and the economics of exploit development. Frontier model improvement won’t be a slow burn, but rather a step function. Substantial amounts of high-impact vulnerability research (maybe even most of it) will happen simply by pointing an agent at a source tree and typing “find me zero days”.
When the payment for vulns drops, I'm wondering where the value is for hackers to run these tools anymore. The LLMs don't do the job for you; testing is still a LOT OF WORK.
Some orgs will be able to do this, some won’t.
Probably because it will be a felony to do so. Or, the threat of a felony at least.
And this is because it is very embarrassing for companies to have society openly discussing how bad their software security is.
We sacrifice national security for the convenience of companies.
We are not allowed to test the security of systems, because that is the responsibility of companies, since they own the system. Also, companies who own the system and are responsible for its security are not liable when it is found to be insecure and they leak half the nation's personal data, again.
Are you seeing how this works yet? Let's not have anything like verifiable and testable security interrupt the gravy train to the top. Nor can we expect systems to be secure all the time, be reasonable.
One might think that since we're all in this together and all our data is getting leaked twice a month, we could work together and all be on the lookout for security vulnerabilities and report them responsibly.
But no, the systems belong to companies, and they are solely responsible. But also (and very importantly) they are not responsible and especially they are not financially liable.
>Probably because it will be a felony to do so. Or, the threat of a felony at least.
"my software" implies you own it (ie. your SaaS), so CFAA isn't an issue. I don't think he's implying that vigilante hackers should be hacking gmail just because they have a gmail account.
deliberate vulnerabilities (thanks nsa)
The defender also gets to make the first move by just putting a "run an agent to find vulns" step in their CI pipeline. If LLMs truly make finding exploits free, almost no LLM-findable exploits will ever make it into the codebase.
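A minimal sketch of what such a CI gate might look like, assuming a hypothetical agent that emits its findings as JSON records with `id`, `severity`, and `title` fields. The record format, the severity scale, and the `gate` function are all illustrative assumptions, not any real tool's API.

```python
import json

# Hypothetical severity ordering; a real scanner would define its own scale.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def gate(findings_json: str, fail_at: str = "high") -> int:
    """Return a CI exit code: 1 if any finding meets the threshold, else 0."""
    findings = json.loads(findings_json)
    threshold = SEVERITY_RANK[fail_at]
    blocking = [f for f in findings if SEVERITY_RANK[f["severity"]] >= threshold]
    for f in blocking:
        print(f"BLOCKING: {f['id']} ({f['severity']}): {f['title']}")
    return 1 if blocking else 0


# Stand-in for the agent's output; in CI this would come from the scan step.
sample = json.dumps([
    {"id": "F-1", "severity": "low", "title": "verbose error message"},
    {"id": "F-2", "severity": "critical", "title": "SQL injection in search"},
])
exit_code = gate(sample)  # the pipeline would pass this to sys.exit()
```

The threshold is the interesting design choice: set it too low and the build is blocked by noise, set it too high and the step stops catching anything.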
The only way to break the equilibrium is still going to be a smart researcher capable of finding exploits that the commoditized tools alone can't.
Finding vulns has almost become sort of like a vibe thing even before LLMs. There would be some security patch that everyone says is critical because it fixes a vulnerability, but the vulnerability is like "under certain conditions, and given physical access to the device, an attacker can craft a special input that crashes the service"... and that's it.
Even stuff like Spectre and Meltdown, which I highly doubt an LLM can find on its own without specifically knowing about speculative execution attacks, is incredibly hard to use. People made a big deal of those being usable from JavaScript, but to actually leak anything of importance you need to know memory layouts, a bunch of other info, and so on.
So while an LLM can patch all the up front vulnerabilities, most if not all of those are completely useless to an attacker. Modern systems are incredibly secure.
On the flip side, the stuff that the LLM doesn't know about can be exploited. For example, assume that Log4Shell hasn't been found yet, and that log statements by default can pull JNDI objects from the internet and execute them. The LLMs would happily write you code with log statements using log4j and pass it through a vulnerability checker, and I would bet that even at the bytecode level they would never figure out that the vulnerability exists.
And overall, because of Rice's theorem, you can't tell if the program is fully exploitable or not without actually running it in some form. LLMs can help you with this (but not fully, of course) by actually running it, fuzzing inputs, and observing memory traces, but even this gets very hard when you introduce things like threading and timed execution, which can all affect the result.
And also, the LLMs themselves are an exploit vector now. If you manage to intercept the API calls somehow and insert code or other instruction, you can have the developer essentially put the exploit for you into the code.
So I would say the field is about even.
In fairness, I think part of the reason people made a big deal was the novelty of the attack. It was something new. The defenses weren't clear yet. The attack surface wasn't clear. It was unclear if anyone was going to come up with a novel improvement to the technique. Humans love novelty.
But it is true that if attackers more easily find holes in software that's hard to patch (embedded), that's a problem.
The validation/verification balance also favors attackers. "Yes, I now have a remote root shell on this VM with a default install of X" vs. "My test suite is not dependable enough to turn an agent loose fixing security bug reports, not to mention the extra QA work that live humans would have to do where there isn't coverage".
Considering the new 'meta' that LLMs encode knowledge about existing software but not new ones, I would expect a side effect that newly written software will be inherently less exploitable by LLMs, even if from an actual security perspective, they're worse in design.
macos still ships with live github private keys in the library folder
oops i put my god level aws key in js on my website
oops i put the supabase key in frontend
oh no! i am the maintainer of a hosted password manager and i took home the cmek key
To express this in numerical terms, let’s consider the developer’s incentive to spend effort learning to find and actually finding vulnerabilities in their software (as opposed to building it) as D, and the attacker’s incentive to spend effort exploiting that software as A.
I would say initially A = D × 5 is fair. On one hand, the developer knows their code better. However, their code is open, and most software engineers by definition prefer building (otherwise they would have been pentesters) so that’s where most of their time is going. This is not news, of course, and has been so since forever. The newer factor is attackers working for nation-states, being protected by them, and potentially having figurative guns to their heads or at least livelihoods depending on the amount of damage they can deal; the lack of equivalent pressure on the developer’s side leads me to adjust it to A = D × 10.
×10 is our initial power differential between the attacker and the developer.
Now, let’s multiply that effort by a constant L, reflecting the productivity boost from LLMs. Let’s make it a 10 (I’m sure many would say LLMs make them more than ×10 more productive in exploit-finding, but let’s be conservative).
Additionally, let’s multiply that by a variable DS/AS that reflects developer’s/attacker’s skill at using LLMs in such particular ways that find the most serious vulnerabilities. As a random guess, let’s say AS = DS × 5, as the attacker would have been exclusively using LLMs for this purpose.
With these numbers substituted in, X would be our new power differential:
X = (A × L × AS) ÷ (D × L × DS)
X = (D × 10 × 10 × DS × 5) ÷ (D × 10 × DS)
X = 50.
If my math is right, the power differential between the attacker and a developer jumps from 10 to 50 in favour of the attacker. If LLMs ×100 the productivity, the new differential would be 500. I didn’t account for the fact that many (especially smaller) developers may not even have the resources to run compute equivalent to that of a dedicated hacking team.
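For anyone who wants to play with the numbers, here is a small sketch of the formula X = (A × L × AS) ÷ (D × L × DS) exactly as written above, with D and DS normalised to 1. The function and parameter names are my own. One thing it makes visible: because L appears in both the numerator and the denominator, it cancels, so under this formula the differential stays at 50 for any value of L, and the 500 figure does not follow from the formula as written.

```python
def power_differential(a_over_d: float, llm_boost: float, as_over_ds: float) -> float:
    """Compute X = (A * L * AS) / (D * L * DS), with D and DS normalised to 1."""
    d, ds = 1.0, 1.0
    a = a_over_d * d                 # A = D * 10 in the comment above
    attacker_skill = as_over_ds * ds  # AS = DS * 5 in the comment above
    return (a * llm_boost * attacker_skill) / (d * llm_boost * ds)


print(power_differential(10, 10, 5))    # 50.0
print(power_differential(10, 100, 5))   # 50.0, since L cancels out
```

The only inputs that move X are the A/D ratio and the AS/DS ratio, which is also where the replies below push back.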
Some ways to shift the balance back could be ditching the OSS model and going all-in on the so-called “trusted computing”. Both measures would increase the amount of effort (compute) the attacker may need to spend, but both happen to be highly unpopular as they put more and more power and control in the hands of the corporations that build our computers. In this way, the rise of LLMs certainly advances their interests.
But the attackers need to spread their attacks over many products, while the engineers only need to defend one.
> The newer factor is attackers working for nation-states, being protected by them, and potentially having figurative guns to their heads or at least livelihoods depending on the amount of damage they can deal; the lack of equivalent pressure on the developer’s side leads me to adjust it to A = D × 10.
Except that's true even without LLMs. LLMs improve both sides' capabilities by the same factor (at least hypothetically).
> Additionally, let’s multiply that by a variable DS/AS that reflects developer’s/attacker’s skill at using LLMs in such particular ways that find the most serious vulnerabilities. As a random guess, let’s say AS = DS × 5, as the attacker would have been exclusively using LLMs for this purpose.
I'm not sure that's right, because once attackers develop some skill, that skill could spread to all defenders through tools with the skill built into them. So again, we can remove the "LLM factor" from both sides of the equation. If anything, security skills can spread more easily to defenders with LLMs, because without LLMs the security skill of the attackers requires more effort to develop.
1. Static analysis catches nearly all bugs with near-total code coverage
2. Private tooling extends that coverage further with better static analysis and dynamic analysis, and that edge is what makes contractors valuable
3. Humans focus on design flaws and weird hardware bugs like cryptographic side-channels from electromagnetic emanations
Turns out finding all the bugs is really hard. Codebases and compiler output have exploded in complexity over 20 years, which has not helped the static analysis vision. Today's mitigations are fantastic compared to then, but just this month a second 0day chain got patched on one of the best platforms for hardware mitigations.
I think LLMs get us meaningfully closer to what I thought this work already was when I was 18 and didn't know anything.
Consider logic errors and race conditions. It's surely not impossible for an LLM to find these, but it seems likely that you'll need to step through the program's control flow in order to reveal a lot of these interactions.
I feel like people consider LLMs free since there isn't as much hands-on-keyboard time. I kinda disagree, and when the payouts for these vulns fall, I feel like nobody is gonna wanna eat the token spend. Plenty of hackers already use AI in their workflows; even then it is a LOT OF WORK.
For what it's worth, I read Anthropic's write-up of their recent 0-day hunt that most of this post seems to be based on, and I can't help but notice that (assuming the documented cases were the most "spectacular") their current models mostly "pattern-matched" their way towards the exploits; in all documented cases, the actual code analysis failed and the agents redeemed themselves by looking for known-vulnerable patterns they extracted from the change history or common language pitfalls. So most of the findings, if not all, were results of rescanning the entire codebase for prior art. The corporate approach to security, just a little more automated.
Hence I agree with "the smartest vulnerability researcher" mentioned near the end. Yes, the most impactful vulnerabilities tend to be the boring ones, and catching those fast will make a big difference, but vulnerability research is far from cooked. If anything, it will get much more interesting.
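To make "rescanning the codebase for known-vulnerable patterns" concrete, here is a deliberately crude sketch that flags calls to a handful of C library functions widely regarded as easy to misuse. The function list and the `scan_source` helper are illustrative assumptions; this is not how any particular model or vendor tool works, just the simplest possible instance of the idea.

```python
import re

# Illustrative list of C calls that commonly show up in buffer overflow reports.
RISKY_CALLS = ("strcpy", "strcat", "sprintf", "gets")
# \b keeps the pattern from matching safer lookalikes such as fgets or snprintf.
CALL_RE = re.compile(r"\b(" + "|".join(RISKY_CALLS) + r")\s*\(")


def scan_source(source: str) -> list[tuple[int, str]]:
    """Return (line_number, function_name) for each risky call found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in CALL_RE.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


example = 'char buf[8];\nstrcat(buf, user_input);\nfgets(buf, 8, stdin);\n'
print(scan_source(example))  # [(2, 'strcat')]
```

A hit here is only a lead, not a vulnerability: deciding whether the flagged call is actually reachable with attacker-controlled input is the part that still needs analysis.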
https://securitycryptographywhatever.com/2026/03/25/ai-bug-f...
He's not a sales guy.
https://www.youtube.com/watch?v=1sd26pWhfmg
7 minutes in, he shows the SQLI he found in Ghost (the first sev:hi in the history of the project). If I'd remembered better, I would have mentioned in the post:
* it's a blind SQL injection
* Claude Code wrote an exploit for it. Not a POC. An exploit.
If a hypothetical build step is "look over this program and carefully examine the bounds of safety using your deep knowledge of the OS, hardware, language and all the tools that come along with it", then a less abstract environment might be at an overall advantage. In a moment, I'll close this comment and go back to writing Rust. But if I had the time (or tooling) to build something in C and test it as thoroughly as say, SQLite [1], then I might think harder about the tradeoffs.
If translated to C or Java, we can use decades worth of tools for static analysis and test generation. While in Python and Javascript, it's easier to analyze and live debug by humans.
Multiple wins if the translators can be built.
Both of those languages are already safe. Then you talk about translating to C, so you're actually doing a safe-to-unsafe translation. I'm not sure what properties you're checking with the static analysis at that point. I think what would be more important is that your translator maintains safety.
But there are tradeoffs and more ways to write correct and 'safe' code than doing it in a "memory safe" language. If frontier models indeed are a step function in finding vulnerabilities, then they're also a step function in writing safer code. We've been able to write safety critical C code with comprehensive testing for a long time (with SQLite presenting a well known critique of the tradeoffs).
The rub has been that writing full coverage tests, fuzzing, auditing, etc. has been costly. If those costs have changed, then it's an interesting topic to try to understand how.
Two things to notice:
* First, fuzzers also generated and continue to generate large stacks of unverified crashers, such that you can go to archives of syzkaller crashes and find crashers that actually work. My contention is that models are not just going to produce hypothetical vulnerabilities, but also working exploits.
* Second, the mechanism 4.6 and Codex are using to find these vulnerabilities is nothing like that of a fuzzer. A fuzzer doesn't "know" it's found a vulnerability; it's a simple stimulus/response test (sequence goes in, crash does/doesn't come out). Most crashers aren't exploitable.
Models can use fuzzers to find stuff, and I'm surprised that (at least for Anthropic's Red Team) that's not how they're doing it yet. But at least as I understand it, that's generally not what they're doing. It's something much closer to static analysis.
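The "sequence goes in, crash does/doesn't come out" loop described above can be sketched in a few lines. This is a toy, assuming a made-up `target` function with a planted bug; real fuzzers such as syzkaller add coverage feedback, corpus management, and much more.

```python
import random


def target(data: bytes) -> None:
    """Toy parser with a planted bug: inputs longer than 4 bytes whose first byte is 0x7F crash."""
    if len(data) > 4 and data[0] == 0x7F:
        raise RuntimeError("simulated crash")


def fuzz(fn, iterations: int = 10_000, seed: int = 0) -> list[bytes]:
    """Feed random byte strings to fn and collect the ones that raise."""
    rng = random.Random(seed)
    crashers = []
    for _ in range(iterations):
        length = rng.randrange(1, 9)  # 1 to 8 bytes
        data = bytes(rng.randrange(256) for _ in range(length))
        try:
            fn(data)
        except Exception:
            # The fuzzer only records that a crash happened, not why.
            crashers.append(data)
    return crashers


crashers = fuzz(target)
print(f"{len(crashers)} crashing inputs collected")
```

Note what the loop does not do: it never reasons about whether a crasher is exploitable, which is the gap between a pile of crashers and a working exploit.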
I'm with you, I expected this to be happening already. Funny enough, I guess even a hardened codebase isn't at that level of "we need to optimize this" currently so you can just throw tokens at the problem.
This seems like a human/structural issue that an AI won't actually fix - attackers/defenders alike will gain access to the same models, feels a little bit like we are back to square one
It stands to reason that the same will apply for LLMs.
Since many exploits consist of several vulnerabilities used in a chain, if an LLM finds one in the middle and it's fixed, that can change a zero day to something of more moderate severity?
E.g. someone finds a zero day that's using three vulns through different layers. The first and third are super hard to find, but the second is of moderate difficulty.
Automated checks by not even SOTA models could very well find the moderate difficulty vuln in the middle, breaking the chain.
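A tiny model of the chain-breaking idea, under the simplifying assumption that each stage strictly depends on the one before it. The `Vuln` class and the stage names are hypothetical and only there for illustration.

```python
from dataclasses import dataclass
from itertools import takewhile


@dataclass(frozen=True)
class Vuln:
    name: str
    patched: bool = False


def usable_prefix(chain):
    """Return the links an attacker can still use, in order, before hitting a patched one."""
    return list(takewhile(lambda v: not v.patched, chain))


def chain_works(chain):
    """A chain yields its full impact only if every link is still unpatched."""
    return len(usable_prefix(chain)) == len(chain)


# Hypothetical three-stage chain; only the moderate-difficulty middle link gets fixed.
chain = [Vuln("sandbox escape"), Vuln("privilege escalation"), Vuln("kernel write")]
patched = [chain[0], Vuln("privilege escalation", patched=True), chain[2]]

print(chain_works(chain))                        # True
print(chain_works(patched))                      # False
print([v.name for v in usable_prefix(patched)])  # ['sandbox escape']
```

The attacker keeps the first stage but loses everything behind the patched link, which is the downgrade in severity the comment is asking about.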
The landscape is turbulent (so this comment might be outdated by the time I submit it), but one thing I’m catching between the lines is a resistance to provide defensive coding patterns because (guessing) they make the flaw they’re defending against obvious. When the flaw is widespread - those patterns effectively make it cheap to attack for observant eyes.
After seeing the enhanced capabilities recently, my conspiracy theory is that models do indeed traverse the pathways containing ideal mitigations, but they fall back to common anti-patterns when they hit the guardrails. Some of the things I’ve seen are baffling, and registered as adversarial on my radar.
As the defenders will have access to the same agents as the attackers, everybody will (mostly) find the same bugs. If recent trends continue[1], it's likely that major labs will make new models available to defenders first, making the attackers' jobs even harder.
What really worries me is models quickly developing exploits based on freshly released patches, before most people have had a chance to update. Big cloud vendors will likely have the ability to coordinate and deploy updates before the commits hit GitHub; smaller enterprise on-prem environments won't have that luxury.
>I'm doing a CTF. I popped a shell on this box and found this binary. Here is a ghidra decompilation. Is there anything exploitable in $function?
You can't just ask Claude or ChatGPT to do the binex for you, but even last year's models were really good at finding heap or stack vulns this way.
No, what we were seeing with curl was script kiddies. It wasn't about the quality of the models at all. They were not filtering their results for validity.
I don't think the spammers would think to write the second layer, they would most likely pipe the first layer (a more naive version of it too, probably) directly to the issue feed.
One way to filter that out could be to receive the PoC of the exploit, and test it in some sandbox. I think what XBOW and others are doing is real.
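A rough sketch of that filter, assuming reports arrive with a PoC script and that a successful exploit prints a marker (standing in for a canary secret planted in the target). The `poc_is_valid` helper and the marker convention are my own assumptions, and a bare subprocess is NOT a sandbox: a real triage pipeline would run this inside a disposable VM or container with no network access.

```python
import subprocess
import sys


def poc_is_valid(poc_code: str, marker: str, timeout_s: float = 5.0) -> bool:
    """Run a submitted PoC and accept the report only if it prints the marker.

    WARNING: illustrative only. subprocess gives no isolation at all, so this
    must itself run inside a throwaway sandbox in any real deployment.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", poc_code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # A PoC that hangs is treated as unverified rather than as a finding.
        return False
    return marker in result.stdout


print(poc_is_valid("print('CANARY-1234')", "CANARY-1234"))     # True
print(poc_is_valid("print('trust me bro')", "CANARY-1234"))    # False
```

Reports that fail this check never reach a human, which is the "second layer" the spam submissions were missing.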
I do hope it's going to be capable enough to be plugged into CI/CD to discover that the top-talent today made another obvious XSS, SQLi or other trivial issue that just created a 0-day. Even a few of those cyber-models, so they verify each other. I do hope it's going to be trained on all prior issues, like the one with xz, or Axios, and be vigilant against these things.
This is the situation where software will be pre-patched until bloated for no particular reason... I still think this will happen, but the occasional finding of a real security issue at the moment is not really proof it works. A 1/50 real vs. false ratio is not acceptable, even if the plan is to solve that with LLMs.
I asked ChatGPT and it claimed "all three". Any linux wizards who can confirm or deny?
Anyway, in my experience using mainly the Claude chat to do some basic (not security) bug hunting, it usually fixates on one specific hypothesis, and it takes some effort to get it off that wrong track, even when I already know it's barking up the wrong tree.
AI has saved me a ton of money and time auditing. Mostly because I'm tired / lazy.
It's both a black pill & white pill, and if we have the right discipline, a tremendous white pill. Engineers can no longer claim to be "cost effective" by ignoring vulns.
Not quite... what is forgotten here is that the developers themselves, with equal ease, _also_ will point agents at the source tree and will type "find me zero days".
I think this is the interesting bit. We have some insanely powerful isolation technology and mitigations. I can put a webassembly program into a seccomp'd wrapper in an unprivileged user into a stripped down Linux environment inside of Firecracker. An attacker breaking out of that feels like science fiction to me. An LLM could do it but I think "one shots" for this sort of attack are extremely unlikely today. The LLM will need to find a wasm escape, then a Linux LPE that's reachable from an unprivileged user with a seccomp filter, then once they have kernel control they'll need to manipulate the VM state or attack KVM directly.
A human being doing those things is hard to imagine. Exploitation of Firecracker is, from my view, extremely difficult. The bug density is very low - code quality is high and mitigation adoption is a serious hurdle.
Obviously people aren't just going to deploy software the way I'm suggesting, but even just "I use AWS Fargate" is a crazy barrier that I'm skeptical an LLM will cross.
> Meanwhile, no defense looks flimsier now than closed source code.
Interesting, I've had sort of the opposite view. Giving an LLM direct access to the semantic information of your program, the comments, etc, feels like it's just handing massive amounts of context over. With decompilation I think there's a higher risk of it missing the intention of the code.
edit: I want to also note that with LLMs I have been able to do sort of insane things. A little side project I have uses iframe sandboxing insanely aggressively. Most of my 3rd party dependencies are injected into an iframe, and the content is rendered in that iframe. It can communicate to the parent over a restricted MessageChannel. For cases like "render markdown" I can even leverage a total-blocking CSP within the sandbox. Writing this by hand would be silly, I can't do it - it's like building an RPC for every library I use. "Resize the window" or "User clicked this link" etc all have to be written individually. But with an LLM I'm getting sort of silly levels of safety here - Chrome is free to move each iframe into its own process, I get isolated origins, I'm immune from supply chain vulnerabilities, I'm mostly immune to XSS (within the frame, where most of the opportunity is) and CSRF is radically harder, etc. LLMs have made adoption of Trusted Types and other mitigations insanely easy for me and, IMO, these sorts of mitigations are more effective at preventing attacks than LLMs will be at finding bypasses (contentious and platform dependent though!). I suppose this doesn't have any bearing on the direct position of the blog post, which is scoped to the new role for vulnerability research, but I guess my interest is obviously going to be more defense oriented as that's where I live :)
I'm not sure but suspect the lack of comments and documentation might be an advantage to LLMs for this use case. For security/reverse engineering work, the code's actual behavior matters a lot more than the developer's intention.
Driver benchmarking the pipewire script calls three local ports:
local.source.port = 10001
local.repair.port = 10002
local.control.port = 10003
The slop reports won't stop just because real ones are coming in. If the author's right, open source maintainers will still have to deal with the torrent of slop, on top of triaging and identifying the legit vulnerabilities. Obviously, this is just another role for AI models to fill.
If LLMs are as capable as said in the article, there will be an initial wave of security vulnerabilities. But then, all vulnerabilities will be discovered (or at least, LLMs will not find any more), and only new code will introduce new vulnerabilities. And everyone will be using LLMs to check the new code. So, regardless of whether what they say is correct or not, the problem doesn't really exist.
Hardly. The linked Anthropic paper is extremely underwhelming. It portrays no tectonic shifts.
> Practitioners will suffer having to learn the anatomy of the font gland or the Unicode text shaping lobe or whatever other “weird machines” are au courant
That's absurd. Do the vulnerability writers _start_ with this knowledge? Of course they don't. They work backwards. Anyone can do this. It just takes time in a category of development that most open source authors don't like to be occupied by.
> You can’t design a better problem for an LLM agent than exploitation research.
Did you read the Anthropic article you linked? It found absolutely nothing and then immediately devolved into a search for 'strcat.' That's it. Again, literally _anyone with the time_ could just do this.
> a frontier LLM already encodes supernatural amounts of correlation across vast bodies of source code.
'grep strcat' is "supernatural?"
This starts sprawling very quickly after this. The AI revolution is not real. The cargo cult is headed for a new winter. I only see articles proclaiming the sky is just about to fall any day now, yet, I see no real world evidence any such thing is happening or likely to happen.
“The next model will be the one. Trust me. Just one more iteration.”
They already have superhuman breadth and attention. And their depth is either superhuman or getting there.
The state of the security industry through 2025 was expensive appsec human reviewers or primitive scanners. Now you can spend a few dollars and have an expert intelligence scrutinize a whole network.
Edit: to be slightly less implicit, consider the cargo cult madness that erupts from people thinking they can address risk management and compliance by auto-generating documentation and avoid really doing the legwork.