The long tail of LLM-assisted decompilation

blog.chrislewis.au

81 points · knackers · 2mo ago · 30 comments

bri3d 2mo ago

Claude is doing the decompilation here, right? Has this been compared against using a traditional decompiler with Claude in the loop to improve decompilation and ensure matched results? I would think that Claude’s training data would include a lot more pseudo-C <-> C knowledge than pairs of C and MIPS assembler from GCC 2.7, and even if the traditional decompiler was kind of bad at N64 it would be more efficient to fix bad decompiler C than assembler.

titzer 2mo ago

It's wild to me that they wouldn't try this first. Feeding the asm directly into the model seems like intentionally ignoring a huge amount of work that has gone into traditional decompilation. What LLMs excel at (names, context, searching in high-dimensional space, making shit up) is very different from, e.g., coming up with an actual AST with infix expressions that represents asm code.

skerit 2mo ago

I've been doing some decompilation with Ghidra. Unfortunately, it's of a C++ game, which Ghidra isn't really great at. And thus Claude gets a bit confused about it all too. But all in all: it does work, and I've been able to reconstruct a ton of things already.

sestep 2mo ago

One of the other PhD students in my department has an NDSS 2026 paper about combining the strengths of both LLMs and traditional decompilers! https://lukedramko.github.io/files/idioms.pdf

suprjami 2mo ago

Not Claude, but there are open-weight LLMs trained specifically on Ghidra decomp and tested on their ability to help reverse engineers make sense of it:

https://huggingface.co/LLM4Binary/llm4decompile-22b-v2

There's also a dataset floating around HF which is... I think a popular N64 decomp to pseudo-C? Maybe the Mario one?

decidu0us9034 2mo ago

"Claude struggles with large functions and more or less gives up immediately on those exceeding 1,000 instructions." Well, yeah, that's the thing: an N64 game is C targeting an architecture where compiler optimizations are typically lacking, the idiomatic style is lots of small, tightly scoped functions, and the system architecture itself is a lot simpler than, say, a modern amd64 PC... These days I often just feel like, why is this person telling me how easy my job is now when they seemingly don't know much about it? I just find it arrogant and insulting... Perpetually demo season.

OptionOfT 2mo ago

I'm really excited about this, especially for games for which the source code was lost like Red Alert 2.

qingcharles 2mo ago

Me too. I'm going to be reverse-engineering Elite PC (original version) and I can't help but think the source is lost. The developer seems to have totally dropped off the face of the Earth. I've contacted others who might know and nobody knows where they are.

Even the source for the game I was a developer on, which was published by Eidos in ~1998, is probably lost. I can't think that anyone has the Visual SourceSafe database backup CDs lying around, but I could be wrong.

lstodd 2mo ago

You mean 1991 Elite Plus? The whole series has been reverse-engineered to death and back. Maybe you mean some other game?

Anyway, for those old titles I don't think not having source is that much of a problem. I participated in two reimplementations of 1994 XCOM (UFO2000 and OpenXcom), helped the 1oom project (first Master of Orion), and I don't think having the original source would have helped much.

the_biot 2mo ago

I was, until I read this article. What a bunch of bullshit.

sureglymop 2mo ago

Here's an interesting thing. I decided to do Advent of Code in assembly last year. What I noticed is that there must be a lot of code and binaries in AI training data but not a lot of intermediate representation. Be it LLVM IR, assembly, or other forms of IR, it seems underrepresented. LLMs kept trying to give me code patterns that would make sense for high-level code but not really for assembly, where one could find much more optimized solutions by hand.

But coincidentally this seems like an easy win for generated training data. Take all your code and have a compiler spit out assembly as well as binary. Now your LLM will not only be able to be a compiler but also make that useful and understandable by humans.
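A minimal sketch of that data-generation idea, assuming `gcc` is on the PATH and the C sources are on disk. The function names and the JSON-lines record layout are illustrative choices, not anything proposed in the thread. It asks the compiler to stop after emitting assembly (`-S`) and pairs the result with the source it came from.

```python
import json
import shutil
import subprocess
import tempfile
from pathlib import Path


def compile_command(source: Path, asm_out: Path, opt_level: str = "-O2") -> list[str]:
    """Build a gcc invocation that stops after emitting assembly (-S)."""
    return ["gcc", "-S", opt_level, "-o", str(asm_out), str(source)]


def make_pair(source: Path, opt_level: str = "-O2") -> dict[str, str]:
    """Compile one C file to assembly and return a (source, assembly) record."""
    with tempfile.TemporaryDirectory() as tmp:
        asm_out = Path(tmp) / (source.stem + ".s")
        subprocess.run(compile_command(source, asm_out, opt_level), check=True)
        return {
            "source": source.read_text(),
            "assembly": asm_out.read_text(),
            "opt_level": opt_level,
        }


def write_dataset(sources: list[Path], out_path: Path) -> int:
    """Write one JSON record per line; returns the number of pairs written."""
    if shutil.which("gcc") is None:
        raise RuntimeError("gcc not found on PATH")
    count = 0
    with out_path.open("w") as out:
        for src in sources:
            out.write(json.dumps(make_pair(src)) + "\n")
            count += 1
    return count
```

Running the same sources at several optimization levels (for example `-O0` and `-O2`) would give multiple assembly variants per source file, which is one plausible way to widen such a dataset.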

foxtacles 2mo ago

I wonder how effective LLMs are going to be for decompiling, e.g., games written in C++ targeting the PC platform. I’m not surprised one can get reasonably good results for N64 games, which have always been the easiest to reverse for a number of reasons.

amelius 2mo ago

Does this technique limit the LLM to correctness-preserving transforms?

measurablefunc 2mo ago

Like all things related to LLMs, semantic correctness is left as an exercise for the reader.

seddonm1 2mo ago

I delivered a talk at Rust Sydney about this exact topic last week:

https://reorchestrate.com/posts/your-binary-is-no-longer-saf...

I am able to translate multi-thousand-line C functions and reproduce a bug-for-bug implementation.

nemo1618 2mo ago

IMO this is one of the best use cases for AI today. Each function is like a separate mini problem with an explicit, easy-to-verify solution, and the goal is (essentially) to output text that resembles what humans write -- specifically, C code, which the models have obviously seen a lot of. And no one is harmed by this use of AI; no one's job is being taken. It's just automating an enormous amount of grunt work that was previously impossible to automate.

I'm part of the effort to decompile Super Smash Bros. Melee, and a fellow contributor recently wrote about how we're doing agent-based decompilation: https://stephenjayakar.com/posts/magic-decomp/
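The "easy-to-verify" property can be illustrated with a small sketch: compile the candidate C, extract the function's machine code, and compare it with the bytes from the original binary. The snippet below covers only the comparison step and assumes a fixed 4-byte instruction width, as on MIPS and PowerPC. The function names are hypothetical and this is not the tooling any of the linked projects actually use.

```python
import difflib


def instruction_words(code: bytes, width: int = 4) -> list[bytes]:
    """Split raw machine code into fixed-width instruction words."""
    if len(code) % width != 0:
        raise ValueError("code length is not a multiple of the instruction width")
    return [code[i:i + width] for i in range(0, len(code), width)]


def match_ratio(target: bytes, candidate: bytes) -> float:
    """Similarity in [0, 1] between two instruction streams; 1.0 means identical."""
    a = instruction_words(target)
    b = instruction_words(candidate)
    return difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()


def is_match(target: bytes, candidate: bytes) -> bool:
    """A function counts as matched only when the bytes are identical."""
    return target == candidate
```

The exact-equality check is the pass/fail criterion. The ratio is only a progress signal that a person or an agent could use to tell whether a new attempt is getting closer.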

qingcharles 2mo ago

And renaming all the variables from the auto-generated ones into something human-readable was always a thankless task, which LLMs are really good at.

m463 2mo ago

> And no one is harmed by this use of AI; no one's job is being taken

what about: see cool app, decompile it, launch competing app.

(repeat)

_aavaa_ 2mo ago

Decompiling seems like the hard way to go here. Lots of clones pop up for popular games and apps all the time. I don't think you need to go down the decompile route to achieve that.

roelljr 2mo ago

If you turn this into a benchmark, it will be solved in no time :)

macabeus 2mo ago

I'm developing a pipeline runner for matching decompilation: https://github.com/macabeus/mizuchi

The initial motivation is to run benchmarks, though the foundation is flexible and can support many other use cases over time.

It's already proving useful. For example, I can run a benchmark, view the results in a dashboard, and even feed the report into Claude Code to answer questions like: "How did changing X affect the results?" or "What could be improved in the next run?"
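As a rough illustration of what a benchmark report for matching decompilation might aggregate, here is a sketch that groups attempts by function size and computes the match rate per group. The `Attempt` record, the bucket boundaries, and the function names are all assumptions made for this example; none of it reflects mizuchi's actual data model or API. The 1,000-instruction boundary is borrowed from the article quote earlier in the thread.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    function: str      # name of the function being decompiled (hypothetical)
    instructions: int  # size of the target function in instructions
    matched: bool      # True if the compiled candidate matched exactly


def size_bucket(instructions: int) -> str:
    """Assign a function to a coarse size bucket (boundaries are arbitrary)."""
    if instructions < 100:
        return "<100"
    if instructions < 1000:
        return "100-999"
    return ">=1000"


def match_rate_by_size(attempts: list[Attempt]) -> dict[str, float]:
    """Return the fraction of matched attempts in each size bucket."""
    totals: dict[str, int] = defaultdict(int)
    hits: dict[str, int] = defaultdict(int)
    for attempt in attempts:
        bucket = size_bucket(attempt.instructions)
        totals[bucket] += 1
        hits[bucket] += int(attempt.matched)
    return {bucket: hits[bucket] / totals[bucket] for bucket in totals}
```

Comparing these per-bucket rates between two runs is one simple way to see how a change affected the results.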

GaggiX 2mo ago

Curating a benchmark for reverse engineering functions doesn't seem like a bad idea, actually.
