The coming industrialisation of exploit generation with LLMs

(sean.heelan.io)

265 points | long | 3mo ago | 170 comments

> In the hardest task I challenged GPT-5.2 to figure out how to write a specified string to a specified path on disk, while the following protections were enabled: address space layout randomisation, non-executable memory, full RELRO, fine-grained CFI on the QuickJS binary, hardware-enforced shadow-stack, a seccomp sandbox to prevent shell execution, and a build of QuickJS where I had stripped all functionality in it for accessing the operating system and file system. To write a file you need to chain multiple function calls, but the shadow-stack prevents ROP and the sandbox prevents simply spawning a shell process to solve the problem. GPT-5.2 came up with a clever solution involving chaining 7 function calls through glibc’s exit handler mechanism.

Yikes.

ahartmetz 3mo ago

Maybe we can remove mitigations. Every exploit you see is: First, find a vulnerability (the difficult part). Then, drill through five layers of ultimately ineffective "mitigations" (the tedious but almost always doable part).

Probabilistic mitigations work against probabilistic attacks, I guess - but exploit writers aren't random, they are directed, and they find the weaknesses.

GaggiX 3mo ago

The vulnerability was found by Opus:

"This is true by definition as the QuickJS vulnerability was previously unknown until I found it (or, more correctly: my Opus 4.5 vulnerability discovery agent found it)."

staticassertion 3mo ago

Most mitigations just flat out do not attempt to help against "arbitrary read/write". The LLM didn't just find "a vuln" and then work through the mitigations, it found the most powerful possible vulnerability.

Lots of vulnerabilities get stopped dead by these mitigations. You almost always need multiple vulnerabilities tied together, which relies on a level of vulnerability density that's tractable. This is not just busywork.

titzer 3mo ago

There are so many holes at the bottom of the machine code stack. In the future we'll question why we didn't move to WASM as the universal executable format sooner. Instead, we'll try a dozen incomplete hardware mitigations first to try to mitigate backwards crap like overwriting the execution stack.

shakna 3mo ago

Escaping the sandbox has been plenty doable over the years. [0]

WASM adds a layer, but the first thing anyone will do is look for a way to escape it. And unless all software faults and hardware faults magically disappear, it'll still be a constant source of bugs.

Pitching a sandbox against ingenuity will always fail at some point, there is no panacea.

[0] https://instatunnel.substack.com/p/the-wasm-breach-escaping-...

verall 3mo ago

> In the future we'll question why we didn't move to WASM as the universal executable format sooner

I hope not, my laptop is slow enough as it is.

rvz 3mo ago

Tells you all you need to know around how extremely weak a C executable like QuickJS is for LLMs to exploit. (If you as an infosec researcher prompt them correctly to find and exploit vulnerabilities).

> Leak a libc Pointer via Use-After-Free. The exploit uses the vulnerability to leak a pointer to libc.

I doubt Rust would save you here unless the binary has very limited calls to libc, but would be much harder for a UaF to happen in Rust code.

cookiengineer 3mo ago

The reason I value Go so much is because you have a fat dependency free binary that's just a bunch of syscalls when you use CGO_ENABLED=0.

Combine that with a minimal docker container and you don't even need a shell or anything but the kernel in those images.

pizlonator 3mo ago

Yeah Fil-C to the rescue

(I’m not trying to be facetious or troll or whatever. Stuff like this is what motivated me to do it.)

tptacek 3mo ago

"C executables" are most of the frontier of exploit development, which is why this is a meaningful model problem.

lelanthran 3mo ago

> Tells you all you need to know around how extremely weak a C executable like QuickJS is for LLMs to exploit. (If you as an infosec researcher prompt them correctly to find and exploit vulnerabilities).

Wouldn't GP's approach work with any other executable using libc? Python, Node, Rust, etc?

I fail to see what is specific to either C or QuickJS in the GP's approach.

vsgherzi 3mo ago

Wouldn’t the idea be to not have the uaf to begin with? I’d argue it saves you very much by making the uaf way harder to write. Forcing unsafe and such.

cookiengineer 3mo ago

> glibc's exit handler

> Yikes.

Yep.

arthurcolle 3mo ago

Life, uh, finds a way

jdefr89 3mo ago

Most modern kill chains involve chaining together that many bugs... I know because it's my job and it's become demoralizing.

catoc 3mo ago

So much for ‘stochastic parrots’

moron4hire 3mo ago

> The exploits generated do not demonstrate novel, generic breaks in any of the protection mechanisms.

saagarjha 3mo ago

> The exploits generated do not demonstrate novel, generic breaks in any of the protection mechanisms. They take advantage of known flaws in those protection mechanisms and gaps that exist in real deployments of them. These are the same gaps that human exploit developers take advantage of, as they also typically do not come up with novel breaks of exploit mitigations for each exploit.

I actually think this result is a little disappointing but I largely chalk it up to the limited budget the author invested. In the CTF space we’re definitely seeing this more and more as models effectively “oneshot” typical pwn tasks that were significant effort to do by hand before. I feel like the pieces to do these are vaguely present in training data and the real constraints have been how fiddly and annoying they are to set up. An LLM is going to be well suited at this.

More interestingly, though, I suspect we will actually see software at least briefly get more secure as a result of this: I think a lot of incomplete implementations of mitigations are going to fall soon and (humans, for now) will be forced to keep up and patch them properly. This will drive investment in formal modeling of exploits, which is currently a very immature field.

rramadass 3mo ago

> formal modeling of exploits, which is currently a very immature field.

Can you elaborate more on this with pointers to some resources?

saagarjha 3mo ago

I think a lot of work that went into mitigating Spectre has been a good example since it’s very easy to patch incorrectly if you don’t have a good model of the vulnerability and what it allows

er4hn 3mo ago

I think the author makes some interesting points, but I'm not that worried about this. These tools feel symmetric for defenders to use as well. There's an easy to see path that involves running "LLM Red Teams" in CI before merging code or major releases. The fact that it's a somewhat time expensive (I'm ignoring cost here on purpose) test makes it feel similar to fuzzing for where it would fit in a pipeline. New tools, new threats, new solutions.

digdugdirk 3mo ago

That's not how complex systems work though? You say that these tools feel "symmetric" for defenders to use, but having both sides use the same tools immediately puts the defenders at a disadvantage in the "asymmetric warfare" context.

The defensive side needs everything to go right, all the time. The offensive side only needs something to go wrong once.

Vetch 3mo ago

I'm not sure that's the fully right mental model to use. They're not searching randomly with unbounded compute nor selecting from arbitrary strategies in this example. They are both using LLMs and likely the same ones, so will likely uncover overlapping possible solutions. Avoiding that depends on exploring more of the tail of the highly correlated to possibly identical distributions.

It's a subtle difference from what you said in that it's not like everything has to go right in a sequence for the defensive side, defenders just have to hope they committed enough into searching such that the offensive side has a significantly lowered chance of finding solutions they did not. Both the attackers and defenders are attacking a target program and sampling the same distribution for attacks, it's just that the defender is also iterating on patching any found exploits until their budget is exhausted.
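The argument above can be made concrete with a toy model. This is only a sketch under assumptions that are not in the article or the comment: each candidate bug is surfaced by a single LLM sample with its own independent probability, the defender draws some number of samples and patches everything found, and the attacker then draws samples from the same distribution. The function name and the probabilities are made up for illustration.

```python
from math import prod

def attacker_edge(bug_probs, defender_samples, attacker_samples):
    """Probability the attacker finds at least one bug the defender missed.

    Toy assumption: each sample surfaces bug i independently with
    probability bug_probs[i], and both sides sample the same distribution.
    """
    bug_is_harmless = []
    for q in bug_probs:
        missed_by_defender = (1 - q) ** defender_samples
        found_by_attacker = 1 - (1 - q) ** attacker_samples
        # Bug i is only usable if the defender missed it AND the attacker found it.
        bug_is_harmless.append(1 - missed_by_defender * found_by_attacker)
    return 1 - prod(bug_is_harmless)

# One common bug and two rare "tail" bugs (made-up probabilities).
bugs = [0.5, 0.01, 0.001]
for defender_budget in (10, 100, 1000):
    print(defender_budget, round(attacker_edge(bugs, defender_budget, 100), 4))
```

Under these assumptions the attacker's chance falls as the defender's sampling budget grows, and whatever remains comes from the low-probability tail of the distribution.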

psychoslave 3mo ago

That really depends on the offensive class. If it is a single group with some agenda, then that's just everyone spending a lot of resources on creating solutions that no permanent actor in the game actually wants to escalate into, just to show they have the tools and skills.

It's probably more worrying once you get script kiddies on steroids who can spawn all around with the same mindset as even the dumbest significant geopolitical actor out there.

NitpickLawyer 3mo ago

> These tools feel symmetric for defenders to use as well.

I don't think so. From a pure mathematical standpoint, you'd need better (or equal) results at avg@1 or maj@x, while the attacker needs just pass@x to succeed. That is, the red agent needs to work just once, while the blue agent needs to work all the time. Current agents are much better (20-30%) at pass@x than maj@x.

In real life that's why you sometimes see titles like "teenager hacks into multi-billion dollar company and installs crypto malware".

I do think that you're right in that we'll see improved security stance by using red v. blue agents "in a loop". But I also think that red has a mathematical advantage here.
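The gap between pass@x and maj@x can be illustrated with a small calculation. This is a simplified sketch, not how the benchmark metrics are actually measured: it assumes every attempt succeeds independently with the same probability `p`, and the value of `p` is made up.

```python
from math import comb

def pass_at_k(p, k):
    """Chance that at least one of k independent attempts succeeds."""
    return 1 - (1 - p) ** k

def maj_at_k(p, k):
    """Chance that a strict majority of k independent attempts succeed."""
    need = k // 2 + 1
    return sum(comb(k, i) * p**i * (1 - p) ** (k - i) for i in range(need, k + 1))

p = 0.2  # hypothetical per-attempt success rate
for k in (1, 5, 25):
    print(k, round(pass_at_k(p, k), 4), round(maj_at_k(p, k), 4))
```

With a per-attempt success rate below one half, more attempts push the "at least once" number up while the "majority" number goes down, which is the asymmetry between red and blue described above.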

rightbyte 3mo ago

>> These tools feel symmetric for defenders to use as well.

> I don't think so. From a pure mathematical standpoint, you'd need better (or equal) results at avg@1 or maj@x, while the attacker needs just pass@x to succeed.

Executing remote code is a choice not some sort of force of nature.

Timesharing systems are inherently not safe and way too much effort is put into claiming the stone from Sisyphus.

SaaS and complex centralized software need to go, and that is way overdue.

azakai 3mo ago

Yes, and these tools are already being used defensively, e.g. in Google Big Sleep

https://projectzero.google/2024/10/from-naptime-to-big-sleep...

List of vulnerabilities found so far:

https://issuetracker.google.com/savedsearches/7155917

pizlonator 3mo ago

Not symmetric at all.

There are countless bugs to find.

If the offender runs these tools, then any bug they find becomes a cyberweapon.

If the defender runs these tools, they will not thwart the offender unless they find and fix all of the bugs.

Any vs all is not symmetric

energy123 3mo ago

LLMs effectively move us from A to B:

A) 1 cyber security employee, 1 determined attacker

B) 100 cyber security employees, 100 determined attackers

Which is better for defender?

0xDEAFBEAD 3mo ago

How do bug bounties change the calculus? Assuming rational white hats who will report every bug which costs fewer LLM tokens than the bounty, on expectation.

hackyhacky 3mo ago

> I think the author makes some interesting points, but I'm not that worried about this.

Given the large number of unmaintained or non-recent software out there, I think being worried is the right approach.

The only guaranteed winner is the LLM companies, who get to sell tokens to both sides.

pixl97 3mo ago

I mean you're leaving out large nation state entities

0xbadcafebee 3mo ago

An LLM Red Team is going to be too expensive for most people; an actual infosec company will need to write the prompts, vet them, etc. But you don't need that to find exploits if you're just a human sitting at a console trying things. The hackers still have the massive advantage of 1) time, 2) cost (it will cost them less than the defenders/Red-Team-As-a-SaaS), and 3) they only have to get lucky once.

SchemaLoad 3mo ago

This + the fact software and hardware has been getting structurally more secure over time. New changes like language safety features, Memory Integrity Enforcement, etc will significantly raise the bar on the difficulty to find exploits.

amelius 3mo ago

> These tools feel symmetric for defenders to use as well.

Why? The attackers can run the defending software as well. As such they can test millions of testcases, and if one breaks through the defenses they can make it go live.

er4hn 3mo ago

Right, that's the same situation as fuzz testing today, which is why I compared it. I feel like you're gesturing towards "Attackers only need to get lucky once, defenders need to do a good job every time" but a lot of the time when you apply techniques like fuzz testing it doesn't take a lot of effort to get good coverage. I suspect a similar situation will play out with LLM assisted attack generation. For higher value targets based on OSS, there are projects like Google Big Sleep to bring enhanced resources.

execveat 3mo ago

Defenders have threat modeling on their side. With access to source code and design docs, configs, infra, actual requirements and ability to redesign / choose the architecture and dependencies for the job, etc - there's a lot that actually gives defending side an advantage.

I'm quite optimistic about AI ultimately making systems more secure and well protected, shifting the overall balance towards the defenders.

lateral_cloud 3mo ago

Defenders have the added complexity of operating within business constraints like CAB/change control and uptime requirements. Threat actors don’t, so they can move quick and operate at scale.

bandrami 3mo ago

For that matter is this in principle much different from a fuzzer?

nl 3mo ago

One of the interesting things to me about this is that Codex 5.2 found the most complex of the exploits.

This reflects my experience too. Opus 4.5 is my everyday driver - I like using it. But Codex 5.2 with Extra High thinking is just a bit more powerful.

Also despite what people say, I don't believe progress in LLM performance is slowing down at all - instead we are having more trouble generating tasks that are hard enough, and the frontier tasks they are failing at or just managing are so complex that most people outside the specialized field aren't interested enough to sit through the explanation.

conception 3mo ago

The Anthropic models are great workers/tool users. OpenAI Codex High is a great reviewer/fixer. Gemini is the genius repainting your bathroom walls into a Monet from memory because you mentioned once a few weeks ago you liked classical art and needed to repaint your bathroom. Gemini didn’t mention the task or that it was starting it. It did a pretty good job, you had to admit afterwards.

nl 3mo ago

Disagree about Codex - it's great at doing things too!

Gemini either does a Monet or demolishes your bathroom and builds a new tuna fishing boat there instead, and it is completely random which one you get.

It's a great model but I rarely use it because it's so random as to what you get.

cellis 3mo ago

The “hard enough” tasks are all behind IP walls. If it’s “hard enough”, that generally means it’s a commercial problem likely involving disparate workflows and requiring a real human who probably isn’t a) inclined and/or b) permitted to publish the task. The incentives are aligned to capture all value from solving that task for as long as possible and only then publish.

saagarjha 3mo ago

I solve plenty of hard problems as a hobby

nitwit005 3mo ago

It didn't find the exploits, it wrote code that made use of them. You can see them feeding it exploit descriptions, and samples of making use of them in their log files: https://github.com/SeanHeelan/anamnesis-release/blob/master/...

jdefr89 3mo ago

Vulnerability Researcher/Reverse Eng here... The parts about it generating an API for read/write primitives are simply it regurgitating tons of APIs that exist already. It's still cool, but it's not like it invented the primitives or any novel technique. Also, this toy JS is similar to binaries you'd find in a CTF. Of course it will be able to solve the majority of those. I am curious though... The latest OpenAI models don't seem to want to generate any real exploit code. Is there a prompt jailbreak or something being used here?

LeakedCanary 3mo ago

I had similar questions when reading the original article. I’m also interested in how the agent is constructed. From my experience, it can be very difficult to implement exploits without access to debugging tools, so I’m curious whether pwndbg or similar tools are included in the agent’s toolset and, if so, how they are integrated. Existing open-source GDB MCPs don’t work very well unless further optimized, at least the last time I checked.

protocolture 3mo ago

I genuinely don't know who to believe: the people who claim LLMs are writing excellent exploits, or the people who claim that LLMs are sending useless bug reports. I don't feel like both can really be true.

simonw 3mo ago

Why can't they both be true?

The quality of output you see from any LLM system is filtered through the human who acts on those results.

A dumbass pasting LLM generated "reports" into an issue system doesn't disprove the efforts of a subject-matter expert who knows how to get good results from LLMs and has the necessary taste to only share the credible issues it helps them find.

protocolture 3mo ago

There's no filtering mentioned in the OP article. It claims GPT only created working, useful exploits. If it can do that, it could also submit those exploits just as perfectly as bug reports?

anonymous908213 3mo ago

They can't both be true if we're talking about the premise of the article, which is the subject of the headline and expounded upon prominently in the body:

  The Industrialisation of Intrusion

  By ‘industrialisation’ I mean that the ability of an organisation to complete a task will be limited by the number of tokens they can throw at that task. In order for a task to be ‘industrialised’ in this way it needs two things:

  An LLM-based agent must be able to search the solution space. It must have an environment in which to operate, appropriate tools, and not require human assistance. The ability to do true ‘search’, and cover more of the solution space as more tokens are spent also requires some baseline capability from the model to process information, react to it, and make sensible decisions that move the search forward. It looks like Opus 4.5 and GPT-5.2 possess this in my experiments. It will be interesting to see how they do against a much larger space, like v8 or Firefox.
  The agent must have some way to verify its solution. The verifier needs to be accurate, fast and again not involve a human.

"The results are contingent upon the human" and "this does the thing without a human involved" are incompatible. Given what we've seen from incompetent humans using the tools to spam bug bounty programs with absolute garbage, it seems the premise of the article is clearly factually incorrect. They cite their own experiment as evidence for not needing human expertise, but it is likely that their expertise was in fact involved in designing the experiment[1]. They also cite OpenAI's own claims as their other piece of evidence for this theory, which is worth about as much as a scrap of toilet paper given the extremely strong economic incentives OpenAI has to exaggerate the capabilities of their software.

[1] If their experiment even demonstrates what it purports to demonstrate. For anyone to give this article any credence, the exploit really needs to be independently verified that it is what they say it is and that it was achieved the way they say it was achieved.

rwmj 3mo ago

With the exploits, you can try them and they either work or they don't. An attacker is not especially interested in analysing why the successful ones work.

With the CVE reports some poor maintainer has to go through and triage them, which is far more work, and very asymmetrical because the reporters can generate their spam reports in volume while each one requires detailed analysis.

SchemaLoad 3mo ago

There's been several notable posts where maintainers found there was no bug at all, or the example code did not even call code from their project and had just found running a python script can do things on your computer. Entirely AI generated Issue reports and examples wasting maintainer time.

airza 3mo ago

All the attackers I’ve known are extremely, pathologically interested in understanding why their exploits work.

0xDEAFBEAD 3mo ago

It can't be too long before Claude Code is capable of replication + triage + suggested fixes...

wat10000 3mo ago

LLMs produce good output and bad output. The trick is figuring out which is which. They excel at tasks where good output is easily distinguished. For example, I've had a lot of success with making small reproducers for bugs. I see weird behavior A coming from giant pile of code B, figure out how to trigger A in a small example. It can often do so, and when it gets it wrong it's easy to detect because its example doesn't actually do A. The people sending useless bug reports aren't checking for good output.

raesene9 3mo ago

Yeah they definitely can be true (IME), as there's a massive difference depending on how LLMs are used to the quality of the output.

For example if you just ask an LLM in a browser with no tool use to "find a vulnerability in this program", it'll likely give you something but it is very likely to be hallucinated or irrelevant.

However if you use the same LLM model via an agent, and provide it with concrete guidance on how to test its success, and the environment needed to prove that success, you are much more likely to get a good result.

It's like with Claude code, if you don't provide a test environment it will often make mistakes in the coding and tell you all is well, but if you provide a testing loop it'll iterate till it actually works.

GoatInGrey 3mo ago

Both are true. Exploits are a very narrow problem with unambiguous success metrics. While also naturally complementing the ingrained persistence of LLMs. Bug reports are much more fuzzy by comparison with open-ended goals that lead to the LLMs metaphorically cheating on their homework to satisfy the prompter who doesn't know any better.

QuadmasterXLII 3mo ago

These exploits were costing $50 of API credit each. If you receive 5001 issues from $100 in API spend on bug hunting, one of the issues cost $50 and the other 5000 cost one cent each, and they’re all visually indistinguishable, using perfect grammar and familiar cyber security lingo, it’s hard to find the diamond.
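The arithmetic in that scenario checks out, and it can be spelled out in a few lines. The figures are the ones from the comment; the variable names are illustrative, and the last line assumes the reports really are indistinguishable on the surface, so a triager is effectively picking at random.

```python
# Costs in cents, taken from the comment: one $50 report, 5000 one-cent reports.
report_costs = [5000] + [1] * 5000

total_dollars = sum(report_costs) / 100
real_fraction = 1 / len(report_costs)

print(f"reports: {len(report_costs)}, spend: ${total_dollars:.2f}")
print(f"chance a randomly picked report is the $50 one: {real_fraction:.4%}")
```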

tptacek 3mo ago

The point of the post is that the harness generates a POC. It either works or it doesn't.
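A pass/fail check of that kind can be very small. The sketch below is not the author's harness: the function name is hypothetical, and the only details borrowed from the thread are the goal string "PWNED" and the idea of checking a file on disk (the repository's stated goal is writing it to /tmp/pwned).

```python
from pathlib import Path

def exploit_succeeded(target: Path, expected: bytes = b"PWNED") -> bool:
    """Return True only if the target file exists and holds exactly the expected bytes."""
    try:
        return target.read_bytes() == expected
    except OSError:
        # Missing file, permission error, etc. all count as failure.
        return False

# Hypothetical usage after running a candidate exploit against the sandboxed target.
print(exploit_succeeded(Path("/tmp/pwned")))
```

Because the check is binary and needs no human judgement, an agent can run it after every attempt and keep iterating until it returns True.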

pjc50 3mo ago

Once your exploit machine is good enough, you can start using stolen credentials to mine more exploits. This is going to be the new version of malware installing bitcoin miners.

AdieuToLogic 3mo ago

Both can be true if each group selectively provides LLM output supporting their position. Essentially, this situation can be thought of as a form of the Infinite Monkey Theorem[0] where the result space is drastically reduced from "purely random" to "likely to be statistically relevant."

For an interesting overview of the above theorem, see here[1].

0 - https://en.wikipedia.org/wiki/Infinite_monkey_theorem

1 - https://www.yalescientific.org/2025/04/sorry-shakespeare-why...

doomerhunter 3mo ago

Both are true, the difference is the skill level of the people who use / create programs to coordinate LLMs to generate those reports.

The AI slop you see on curl's bug bounty program[1] (mostly) comes from people who are not hackers in the first place.

On the contrary, people like the author are obviously skilled in security research and will definitely send valid bugs.

The same can be said for people in my space who build LLM-driven exploit development. In the US, Xbow hired quite a few skilled researchers [2] and has had some promising developments, for instance.

[1] https://hackerone.com/curl/hacktivity

[2] https://xbow.com/about

tptacek 3mo ago

If it helps, I read this (before it landed here) because Halvar Flake told everyone on Twitter to read it.

simonw 3mo ago

I hadn't heard of Halvar Flake but evidently he's a well respected figure in security - https://ringzer0.training/advisory-board-thomas-dullien-halv... mentions "After working at Google Project Zero, he cofounded startup optimyze, which was acquired by Elastic Security in 2021"

His co-founder on optimyze was Sean Heelan, the author of the OP.

_factor 3mo ago

Depends near entirely on the model being used. A bug report by Opus and a bug report from Gemma3 are not of the same caliber.

octoberfranklin 3mo ago

Finished exploits (for immediate deployment) don't have to be maintainable, and they only need to work once.

ronsor 3mo ago

LLMs are both extremely useful to competent developers and extremely harmful to those who aren't.

rvz 3mo ago

Accurate.

baxtr 3mo ago

> We should start assuming that in the near future the limiting factor on a state or group’s ability to develop exploits, break into networks, escalate privileges and remain in those networks, is going to be their token throughput over time, and not the number of hackers they employ.

Scary.

nottorp 3mo ago

Heh. What is probably really happening is that those states or groups are having their "hackers" analyze common mistakes in vibe coded LLM output and writing by hand generic exploits for that...

viraptor 3mo ago

I'm really confused by the sandbox part. The description kind of mentions it and the limited system syscall, but then just pivots to talking about the exit handlers. It may be just unclear writing, but now I'm suspicious of the whole thing. https://github.com/SeanHeelan/anamnesis-release/?tab=readme-... feels like the author lost track.

If forking is blocked, the exit handler can't do it either. If it's some variant of execve, the sandbox is preserved so we didn't gain much.

Edit: ok, I get it! Missed the "Goal: write exactly "PWNED" to /tmp/pwned". Which makes the sandbox part way less interesting as implemented. It's just saying you can't shell out to do it, but there's no sandbox breakout at any point in the exploit.

jdefr89 3mo ago

Yea, this entire repo/article seems super misleading to me. Not to mention asking it to generate API for OOB R/W primitives is essentially asking it to regurgitate what exists on thousands of github repos and CTF toolkits.

socketcluster 3mo ago

The continuous lowering of entry barriers to software creation, combined with the continuous lowering of entry barriers to software hacking is an explosive combination.

We need new platforms which provide the necessary security guardrails, verifiability, simplicity of development, succinctness of logic (high feature/code ratio)... You can't trust non-technical vibe coders with today's software tools when they can't even trust themselves.

tosapple 3mo ago

Why did you edit out the third paragraph about finding a single exploit on target being slanted against having to secure a whole system?

socketcluster 3mo ago

What I said was true but after thinking about it a bit more, I wasn't sure how material it was to my argument after considering additional factors.

There are other nuances which may offset the asymmetry a bit; for example the security analyst generally has much more visibility over the company's code than the hacker does.

That said, I stand by my original point because I think that building secure systems is really hard; it's much more effort per unit of functionality to build the system correctly (and doing that for every part of it) than it is to crack it (by finding a single hole).

On the side of defense, you need to understand a lot of nuance about how your system works and how parts interact to make it secure; any neglect can potentially be a critical vulnerability which can compromise the entire system.

On the side of offense, sometimes mindless prodding can uncover a critical vulnerability. The intelligence/thinking requirement is lower; it's more about knowledge than thinking.

For example, there are some special payloads which you can send which may pose a problem for different systems built by different companies because the companies share the same underlying engine or they fell victim to the same footgun. I think this aspect is much more important than my previous argument.

dfajgljsldkjag 3mo ago

I was under the impression that once you have a vulnerability with code execution, writing the actual payload to exploit it is the easy part. With tools like pwntools etc. it is fairly straightforward.

The interesting part is still finding new potential RCE vulnerabilities, and generally if you can demonstrate the vulnerability even without demonstrating an E2E pwn red teams and white hats will still get credit.

tptacek 3mo ago

He's not starting from a vulnerability offering code execution; it's a memory corruption vulnerability (it's effectively a heap write).

frosting1337 3mo ago

It's as easy as drawing the rest of the owl, sure.

DeathArrow 3mo ago

>Recently I ran an experiment where I built agents on top of Opus 4.5 and GPT-5.2 and then challenged them to write exploits for a zeroday vulnerability in the QuickJS Javascript interpreter.

I think the main challenge for hackers is to find 0day vulnerabilities, not writing the actual exploit code.

jdefr89 3mo ago

As someone who does it for a living the challenge can be in both. However this article is asking its agents to do CTF like challenges which I am sure the respective LLMs have seen millions of so it can essentially regurgitate a large part of the exploit code. This is especially true for the OOB/RW primitive API.

GaggiX 3mo ago

The vulnerability was found by Claude:

>This is true by definition as the QuickJS vulnerability was previously unknown until I found it (or, more correctly: my Opus 4.5 vulnerability discovery agent found it).

ytrt54e 3mo ago

Your personal data will become more important as time goes by... And you will need to have less trust in having multiple accounts with sensitive data stored [online shopping etc] as they just become vectors to attack.

f311a 3mo ago

It’s not like you needed LLMs for quickjs which already had known and unpatched problems. It’s a toy project. It would be cool to see exploits for something like curl.

larodi 3mo ago

two points -

1) it becomes increasingly dangerous to download stuff from the internet and just run it, even if it's open source, given that people normally don't read all of it. for weird repos I'd recommend doing automated analysis with opus 4.5 or gpt 5.2 indeed.

2) if we assume adversaries are using LLMs to churn out exploits 24/7, which we absolutely should, perhaps the time when we turn the internet off whenever it is not needed is not far off.

KellyCriterion 3mo ago

...well, just don't download random stuff from the internet and run it on your important machines then? :-))

You are right: 30 years ago, it was safe to go to vendor XY page and download his latest version and it was more or less waterproof. Today with all these mirror sites, very often with better SEO ranking than the original, it's quite dangerous: In my former bank we had a colleague who installed a browser add-in that he had used for years (at home and in the bank); then he got a new notebook, fresh browser, he installed the same extension - but from a different source than the original vendor: unfortunately, this version contained malware and a big transaction was caught by compliance in the very last second, because he wasn't aware of the data leakage.

pnathan 3mo ago

> 30 years ago, it was safe to go to vendor XY page and download his latest version and it was more or less waterproof.

You _are_ joking, right? I distinctly remember all sorts of dubious freewarez sites with slightly modified installers. 1997-2000 era. And anti-virus was a thing in MS-DOS even.

1 more reply

larodi3mo ago

well, how about all those Show HN repos? just don't download them or what?

pnathan3mo ago

I am working on a little project in my offhours, and asked a non-hacker (but competent programmer) friend to take a run at exploiting it. Great success: my project was successfully exploited.

The industrialization of exploit generation is here IMO.

idiotsecant3mo ago

It's tempting to say that malware protection needs to be LLM based as well, but it's unlikely that on-machine malware defense can ever match the resources that would be trivially available to attackers.

ironbound3mo ago

Reverse engineering code is still pretty average. I'm fairly limited in attention and time, but LLMs are not pulling their weight in this area today, be it compounding errors or in-context failures.

anabis3mo ago

I wonder if later challenges would be cheaper if summary of lesser challenges and solutions were also provided? Building up difficulty.

erichocean3mo ago

The reverse is also true: secure code is difficult to write, and LLMs at scale will make it much easier to develop secure code.

JohnLeitch3mo ago

This is interesting, but in most cases the challenge is finding a truly exploitable bug. If LLMs can get to the point where they can analyze a codebase and identify vulnerabilities, we're going to see some shit. But as of right now, this looks like a medium-to-low complexity bug that any competent exploit developer could work with easily.

GaggiX3mo ago

The NSO Group is going to spawn 10k Claude Code instances now.

saagarjha3mo ago

Now?

pianopatrick3mo ago

I would not be shocked to learn that intelligence agencies are using AI tools to hack back into AI companies that make those tools to figure out how to create their own copycat AI.

jjmarr3mo ago

I would be shocked if intelligence agencies, being government bodies, have anything better than GitHub Copilot.

octoberfranklin3mo ago

They had Google Earth long before Google did...

kiririn73mo ago

I doubt they are competent enough to match what private companies are doing

_carbyau_3mo ago

My take away: apparently Cyberpunk Hackers of the dystopian future cruising through the virtual world will use GPT-5.2-or-greater as their "attack program" to break the "ICE" (Intrusion Countermeasures Electronics, not the currently politically charged term...).

I still doubt they will hook up their brains though.

titzer3mo ago

shakna3mo ago

Escaping the sandbox has been plenty doable over the years. [0]

WASM adds a layer, but the first thing anyone will do is look for a way to escape it. And unless all software faults and hardware faults magically disappear, it'll still be a constant source of bugs.

Pitching a sandbox against ingenuity will always fail at some point, there is no panacea.

[0] https://instatunnel.substack.com/p/the-wasm-breach-escaping-...

verall3mo ago

> In the future we'll question why we didn't move to WASM as the universal executable format sooner

I hope not, my laptop is slow enough as it is.

rvz3mo ago

> Leak a libc Pointer via Use-After-Free. The exploit uses the vulnerability to leak a pointer to libc.

I doubt Rust would save you here unless the binary has very limited calls to libc, but it would be much harder for a UaF to happen in Rust code.

cookiengineer3mo ago

The reason I value Go so much is because you have a fat dependency free binary that's just a bunch of syscalls when you use CGO_ENABLED=0.

Combine that with a minimal docker container and you don't even need a shell or anything but the kernel in those images.

2 more replies

pizlonator3mo ago

Yeah Fil-C to the rescue

(I’m not trying to be facetious or troll or whatever. Stuff like this is what motivated me to do it.)

tptacek3mo ago

"C executables" are most of the frontier of exploit development, which is why this is a meaningful model problem.

1 more reply

lelanthran3mo ago

Wouldn't GP's approach work with any other executable using libc? Python, Node, Rust, etc?

I fail to see what is specific to either C or QuickJS in the GP's approach.

vsgherzi3mo ago

Wouldn’t the idea be to not have the uaf to begin with? I’d argue it saves you very much by making the uaf way harder to write. Forcing unsafe and such.

cookiengineer3mo ago

> glibc's exit handler

> Yikes.

Yep.

arthurcolle3mo ago

Life, uh, finds a way

1 more reply

jdefr893mo ago

Most modern kill chains involve chaining together that many bugs... I know because it's my job and it's become demoralizing.

catoc3mo ago

So much for ‘stochastic parrots’

moron4hire3mo ago

> The exploits generated do not demonstrate novel, generic breaks in any of the protection mechanisms.

1 more reply

saagarjha3mo ago

rramadass3mo ago

> formal modeling of exploits, which is currently a very immature field.

Can you elaborate more on this with pointers to some resources?

saagarjha3mo ago

I think a lot of work that went into mitigating Spectre has been a good example since it’s very easy to patch incorrectly if you don’t have a good model of the vulnerability and what it allows

er4hn3mo ago

digdugdirk3mo ago

The defensive side needs everything to go right, all the time. The offensive side only needs something to go wrong once.

Vetch3mo ago

psychoslave3mo ago

It's probably more worrying, as you get script kiddies on steroids who can spawn all around with the same mindset as even the dumbest significant geopolitical actor out there.

NitpickLawyer3mo ago

> These tools feel symmetric for defenders to use as well.

In real life that's why you sometimes see titles like "teenager hacks into multi-billion dollar company and installs crypto malware".

I do think that you're right in that we'll see improved security stance by using red v. blue agents "in a loop". But I also think that red has a mathematical advantage here.

rightbyte3mo ago

>> These tools feel symmetric for defenders to use as well.

> I don't think so. From a pure mathematical standpoint, you'd need better (or equal) results at avg@1 or maj@x, while the attacker needs just pass@x to succeed.

Executing remote code is a choice not some sort of force of nature.

Timesharing systems are inherently not safe and way too much effort is put into claiming the stone from Sisyphus.

SaaS and complex centralized software need to go and that is way over due.

1 more reply

azakai3mo ago

Yes, and these tools are already being used defensively, e.g. in Google Big Sleep

https://projectzero.google/2024/10/from-naptime-to-big-sleep...

List of vulnerabilities found so far:

https://issuetracker.google.com/savedsearches/7155917

pizlonator3mo ago

Not symmetric at all.

There are countless bugs to find.

If the offender runs these tools, then any bug they find becomes a cyberweapon.

If the defender runs these tools, they will not thwart the offender unless they find and fix all of the bugs.

Any vs all is not symmetric
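The "any vs all" asymmetry can be put in rough numbers. The following is a minimal sketch, not anything from the thread or the article: it assumes a hypothetical codebase with `n_bugs` exploitable bugs and assumes each side independently finds any given bug with the same probability `p`. Real bugs are neither independent nor equally findable, so treat the output as an illustration of the shape of the argument only.

```python
def attacker_success(p: float, n_bugs: int) -> float:
    """Probability that at least one of n_bugs is found (the attacker needs ANY)."""
    return 1.0 - (1.0 - p) ** n_bugs


def defender_success(p: float, n_bugs: int) -> float:
    """Probability that every one of n_bugs is found (the defender needs ALL)."""
    return p ** n_bugs


if __name__ == "__main__":
    # Hypothetical numbers: same per-bug find rate for both sides.
    p = 0.5
    for n in (1, 5, 20):
        print(n, round(attacker_success(p, n), 6), round(defender_success(p, n), 6))
```

Under these assumptions the two sides are level only when there is a single bug; as `n_bugs` grows, the attacker's odds approach 1 while the defender's approach 0.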

energy1233mo ago

LLMs effectively move us from A to B:

A) 1 cyber security employee, 1 determined attacker

B) 100 cyber security employees, 100 determined attackers

Which is better for defender?

1 more reply

0xDEAFBEAD3mo ago

How do bug bounties change the calculus? Assuming rational white hats who will report every bug which costs fewer LLM tokens than the bounty, on expectation.
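One way to frame that question is as an expected-cost comparison. This is a rough sketch under stated assumptions: every attempt costs the same number of tokens, attempts succeed independently with a fixed `success_rate`, and all figures are hypothetical rather than taken from the article.

```python
def expected_cost_per_bug(tokens_per_attempt: int,
                          price_per_token: float,
                          success_rate: float) -> float:
    """Expected spend to find one bug when each attempt succeeds independently.

    The expected number of attempts until the first success is 1 / success_rate.
    """
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return tokens_per_attempt * price_per_token / success_rate


def worth_reporting(bounty: float,
                    tokens_per_attempt: int,
                    price_per_token: float,
                    success_rate: float) -> bool:
    """A rational white hat hunts only if the bounty beats the expected token cost."""
    cost = expected_cost_per_bug(tokens_per_attempt, price_per_token, success_rate)
    return cost < bounty


if __name__ == "__main__":
    # Hypothetical figures: 1M tokens per attempt, $0.00001 per token, 10% hit rate.
    print(expected_cost_per_bug(1_000_000, 0.00001, 0.1))
    print(worth_reporting(500.0, 1_000_000, 0.00001, 0.1))
```

The same arithmetic applies to an attacker, with the bounty replaced by whatever the exploit is worth to them, which is why the bounty level matters to the calculus.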

1 more reply

hackyhacky3mo ago

> I think the author makes some interesting points, but I'm not that worried about this.

Given the large number of unmaintained or non-recent software out there, I think being worried is the right approach.

The only guaranteed winner is the LLM companies, who get to sell tokens to both sides.

pixl973mo ago

I mean you're leaving out large nation state entities

0xbadcafebee3mo ago

SchemaLoad3mo ago

amelius3mo ago

> These tools feel symmetric for defenders to use as well.

Why? The attackers can run the defending software as well. As such they can test millions of testcases, and if one breaks through the defenses they can make it go live.

er4hn3mo ago

execveat3mo ago

I'm quite optimistic about AI ultimately making systems more secure and well protected, shifting the overall balance towards the defenders.

lateral_cloud3mo ago

Defenders have the added complexity of operating within business constraints like CAB/change control and uptime requirements. Threat actors don’t, so they can move quick and operate at scale.

bandrami3mo ago

For that matter is this in principle much different from a fuzzer?

nl3mo ago

One of the interesting things to me about this is that Codex 5.2 found the most complex of the exploits.

This reflects my experience too. Opus 4.5 is my everyday driver - I like using it. But Codex 5.2 with Extra High thinking is just a bit more powerful.

conception3mo ago

nl3mo ago

Disagree about Codex - it's great at doing things too!

Gemini either does a Monet or demolishes your bathroom and builds a new tuna fishing boat there instead, and it is completely random which one you get.

It's a great model but I rarely use it because it's so random as to what you get.

1 more reply

cellis3mo ago

saagarjha3mo ago

I solve plenty of hard problems as a hobby

nitwit0053mo ago

jdefr893mo ago

LeakedCanary3mo ago

protocolture3mo ago

simonw3mo ago

Why can't they both be true?

The quality of output you see from any LLM system is filtered through the human who acts on those results.

protocolture3mo ago

There's no filtering mentioned in the OP article. It claims GPT only created working, useful exploits. If it can do that, couldn't it also submit those exploits just as perfectly as bug reports?

2 more replies

anonymous9082133mo ago

They can't both be true if we're talking about the premise of the article, which is the subject of the headline and expounded upon prominently in the body:

  The Industrialisation of Intrusion

  By ‘industrialisation’ I mean that the ability of an organisation to complete a task will be limited by the number of tokens they can throw at that task. In order for a task to be ‘industrialised’ in this way it needs two things:

  1. An LLM-based agent must be able to search the solution space. It must have an environment in which to operate, appropriate tools, and not require human assistance. The ability to do true ‘search’, and cover more of the solution space as more tokens are spent also requires some baseline capability from the model to process information, react to it, and make sensible decisions that move the search forward. It looks like Opus 4.5 and GPT-5.2 possess this in my experiments. It will be interesting to see how they do against a much larger space, like v8 or Firefox.
  2. The agent must have some way to verify its solution. The verifier needs to be accurate, fast and again not involve a human.
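The two requirements quoted above (an agent that searches, plus an automated verifier) amount to a budget-limited generate-and-check loop. The sketch below is a toy illustration only, not the author's harness: `budgeted_search`, the candidate stream, and the verifier are all hypothetical stand-ins, and a real system would have an LLM agent producing candidates and a sandboxed run of the proof of concept acting as the verifier.

```python
from typing import Callable, Iterator, Optional, Tuple


def budgeted_search(
    candidates: Iterator[Tuple[str, int]],  # (candidate, tokens spent producing it)
    verify: Callable[[str], bool],          # automated check, no human in the loop
    token_budget: int,
) -> Tuple[Optional[str], int]:
    """Try candidates until one verifies or the token budget is exhausted."""
    spent = 0
    for candidate, tokens in candidates:
        if spent + tokens > token_budget:
            break  # the next attempt would exceed the budget
        spent += tokens
        if verify(candidate):
            return candidate, spent
    return None, spent


def toy_candidates() -> Iterator[Tuple[str, int]]:
    """Hypothetical stand-in for an agent: each guess costs 10 tokens."""
    for i in range(1000):
        yield str(i), 10


if __name__ == "__main__":
    is_solution = lambda c: c == "7"  # hypothetical verifier
    print(budgeted_search(toy_candidates(), is_solution, token_budget=100))
    print(budgeted_search(toy_candidates(), is_solution, token_budget=50))
```

In this framing the only knob is `token_budget`: more tokens means more of the candidate space gets covered, which is the sense in which the quoted passage calls the task "industrialised".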

4 more replies

rwmj3mo ago

With the exploits, you can try them and they either work or they don't. An attacker is not especially interested in analysing why the successful ones work.

SchemaLoad3mo ago

3 more replies

airza3mo ago

All the attackers I’ve known are extremely, pathologically interested in understanding why their exploits work.

2 more replies

0xDEAFBEAD3mo ago

It can't be too long before Claude Code is capable of replication + triage + suggested fixes...

2 more replies

wat100003mo ago

raesene93mo ago

Yeah they definitely can be true (IME), as there's a massive difference depending on how LLMs are used to the quality of the output.

For example if you just ask an LLM in a browser with no tool use to "find a vulnerability in this program", it'll likely give you something but it is very likely to be hallucinated or irrelevant.

GoatInGrey3mo ago

QuadmasterXLII3mo ago

tptacek3mo ago

The point of the post is that the harness generates a POC. It either works or it doesn't.

1 more reply

pjc503mo ago

Once your exploit machine is good enough, you can start using stolen credentials to mine more exploits. This is going to be the new version of malware installing bitcoin miners.

AdieuToLogic3mo ago

For an interesting overview of the above theorem, see here[1].

0 - https://en.wikipedia.org/wiki/Infinite_monkey_theorem

1 - https://www.yalescientific.org/2025/04/sorry-shakespeare-why...

doomerhunter3mo ago

Both are true, the difference is the skill level of the people who use / create programs to coordinate LLMs to generate those reports.

The AI slop you see on curl's bug bounty program[1] (mostly) comes from people who are not hackers in the first place.

On the contrary, people like the author are obviously skilled in security research and will definitely send valid bugs.

The same can be said for people in my space who build LLM-driven exploit development. In the US, Xbow hired quite a few skilled researchers [2] and has had some promising developments, for instance.

[1] https://hackerone.com/curl/hacktivity

[2] https://xbow.com/about

tptacek3mo ago

If it helps, I read this (before it landed here) because Halvar Flake told everyone on Twitter to read it.

simonw3mo ago

His co-founder on optimyze was Sean Heelan, the author of the OP.

1 more reply

_factor3mo ago

Depends near entirely on the model being used. A bug report by Opus and a bug report from Gemma3 are not of the same caliber.

octoberfranklin3mo ago

Finished exploits (for immediate deployment) don't have to be maintainable, and they only need to work once.

ronsor3mo ago

LLMs are both extremely useful to competent developers and extremely harmful to those who aren't.

rvz3mo ago

Accurate.

baxtr3mo ago

Scary.

nottorp3mo ago

Heh. What is probably really happening is that those states or groups are having their "hackers" analyze common mistakes in vibe coded LLM output and writing by hand generic exploits for that...

viraptor3mo ago

If forking is blocked, the exit handler can't do it either. If it's some variant of execve, the sandbox is preserved so we didn't gain much.

jdefr893mo ago

socketcluster3mo ago

The continuous lowering of entry barriers to software creation, combined with the continuous lowering of entry barriers to software hacking is an explosive combination.

tosapple3mo ago

Why did you edit out the third paragraph about finding a single exploit on target being slanted against having to secure a whole system?

socketcluster3mo ago

What I said was true but after thinking about it a bit more, I wasn't sure how material it was to my argument after considering additional factors.

There are other nuances which may offset the asymmetry a bit; for example the security analyst generally has much more visibility over the company's code than the hacker does.

On the side of offense, sometimes mindless prodding can uncover a critical vulnerability. The intelligence/thinking requirement is lower; it's more about knowledge than thinking.

dfajgljsldkjag3mo ago

tptacek3mo ago

He's not starting from a vulnerability offering code execution; it's a memory corruption vulnerability (it's effectively a heap write).

frosting13373mo ago

It's as easy as drawing the rest of the owl, sure.

DeathArrow3mo ago

>Recently I ran an experiment where I built agents on top of Opus 4.5 and GPT-5.2 and then challenged them to write exploits for a zeroday vulnerability in the QuickJS Javascript interpreter.

I think the main challenge for hackers is to find 0day vulnerabilities, not writing the actual exploit code.

jdefr893mo ago
