0 points · tptacek · 2mo ago

Yes, and it apparently burns lots of tokens. But what I've heard is that the outcomes are drastically less expensive than hand-reversing was, when you account for labor costs.

12 comments · 2 top-level

jeffmcjunkin · 2mo ago · 8 in thread

Can confirm. Matching decompilation in particular (where you match the compiler along with your guess at source, compile, then compare assembly, repeating if it doesn't match) is very token-intensive, but it's now very viable: https://news.ycombinator.com/item?id=46080498
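
For readers who haven't seen that loop, the "compare assembly" step might look roughly like the sketch below. It is only an illustration: it assumes the target and candidate listings are already available as text, that comments start with `#`, and that lines may carry a leading `addr:` prefix. `normalize_asm` and `asm_diff` are hypothetical names, not part of any real matching-decompilation toolchain, and the compile and disassemble steps are left out.

```python
import difflib
import re


def normalize_asm(text: str) -> list[str]:
    """Reduce an assembly listing to comparable instruction lines."""
    lines = []
    for line in text.splitlines():
        # Drop trailing comments (assumption: '#' starts a comment).
        line = line.split("#", 1)[0]
        # Drop a leading hex "addr:" prefix as printed by many disassemblers.
        # Caveat: a label made only of hex digits (e.g. "abc:") is stripped too.
        line = re.sub(r"^\s*[0-9a-fA-F]+:\s*", "", line)
        # Collapse runs of whitespace so formatting differences do not count.
        line = " ".join(line.split())
        if line:
            lines.append(line)
    return lines


def asm_diff(target: str, candidate: str) -> list[str]:
    """Return unified-diff lines; an empty list means the listings match."""
    return list(
        difflib.unified_diff(
            normalize_asm(target),
            normalize_asm(candidate),
            fromfile="target",
            tofile="candidate",
            lineterm="",
        )
    )
```

In the loop described above, a non-empty diff is what gets fed back to the model along with the current source guess, and the loop ends when `asm_diff` returns an empty list.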

Of course LLMs see a lot more source-assembly pairs than even skilled reverse engineers, so this makes sense. Any area where you can get unlimited training data is one we expect to see top-tier performance from LLMs.

(also, hi Thomas!)

stackghost · 2mo ago

My own experience has been that "ghidra -> ask LLM to reason about ghidra decompilation" is very effective on all but the most highly obfuscated binaries.

Burning tokens by asking the LLM to compile, disassemble, compare assembly, recompile, repeat seems very wasteful and inefficient to me.

mikestaas · 2mo ago

LaurieWired did a good episode about that kind of thing https://www.youtube.com/watch?v=u2vQapLAW88

kimixa · 2mo ago

That matches my experience too - LLMs are very capable at "translating" between domains - one of the best experiences I've had with LLMs is turning "decompiled" source into "human readable" source. I don't think "Binary Only" closed source is the defense against this that some people here seem to think it is.

echelon · 2mo ago

Has anyone used an LLM to deobfuscate compiled Javascript?

lelanthran · 2mo ago

> Has anyone used an LLM to deobfuscate compiled Javascript?

Seems like a waste of money; wouldn't it be better to extract the AST deterministically, write it out, and only then ask an LLM to replace those auto-generated symbol names with meaningful ones?
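
A minimal sketch of that split, with the caveat that it uses Python's standard `ast` module on Python source as a stand-in, because a JavaScript parser (Babel, for example) is not in the Python standard library. The idea carries over: the parser collects and rewrites identifiers deterministically, and the only thing the LLM would supply is the `mapping` dictionary. `collect_names`, `apply_renames`, and `Renamer` are hypothetical names, and this version ignores scoping, attributes, and imports. It needs Python 3.9+ for `ast.unparse`.

```python
import ast


def collect_names(source: str) -> list[str]:
    """Deterministically list the identifiers an LLM would be asked to rename."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            names.add(node.id)
        elif isinstance(node, ast.arg):
            names.add(node.arg)
        elif isinstance(node, ast.FunctionDef):
            names.add(node.name)
    return sorted(names)


class Renamer(ast.NodeTransformer):
    """Apply an old-name -> new-name mapping to every matching identifier."""

    def __init__(self, mapping: dict):
        self.mapping = mapping

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
        node.name = self.mapping.get(node.name, node.name)
        self.generic_visit(node)  # also rename arguments and body
        return node


def apply_renames(source: str, mapping: dict) -> str:
    """Rewrite the source with the mapping applied; no LLM is involved here."""
    tree = Renamer(mapping).visit(ast.parse(source))
    return ast.unparse(tree)
```

The model then only sees the short list from `collect_names` plus whatever context you choose to send, instead of grepping through the whole bundle.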

heeen2 · 2mo ago

yes, but it requires some nudging if you don't want to waste tokens. it will happily grep and sed through massive javascript bundles but if you tell it to first create tooling like babel scripts to format, it will be much quicker.

saagarjha · 2mo ago

I've used it for hobby efforts on Electron/React Native (Hermes bytecode) apps and it seems to work reasonably well

bitexploder · 2mo ago

Yep. They are good at it.

gfosco · 2mo ago · 2 in thread

Yeah, it's token intensive but worth it. I built a very dumb example harness which used IDA via MCP and analyzed/renamed/commented all ~67k functions in a binary, using Claude Haiku for about $150. A local model could've accomplished it for much less/free. The knowledge base it outputs and the marked up IDA db are super valuable.
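
Not the harness described above, just a rough sketch of the general shape such a loop could take. Everything here is hypothetical: `decompile` and `describe` are caller-supplied callbacks standing in for "fetch pseudocode from the disassembler" and "ask the model for a name and comment", and no real IDA or MCP API is shown. The last helper does the arithmetic on the figures quoted in the comment (about $150 for about 67k functions).

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass
class FunctionNote:
    address: int
    old_name: str
    new_name: str
    comment: str


def annotate(
    functions: Iterable[Tuple[int, str]],
    decompile: Callable[[int], str],
    describe: Callable[[str], Tuple[str, str]],
) -> list:
    """Walk every (address, name) pair and collect a suggested name and comment."""
    notes = []
    for address, old_name in functions:
        pseudocode = decompile(address)           # hypothetical: pseudocode lookup
        new_name, comment = describe(pseudocode)  # hypothetical: one model call
        notes.append(FunctionNote(address, old_name, new_name, comment))
    return notes


def cost_per_function(total_usd: float, function_count: int) -> float:
    """Average spend per analyzed function."""
    return total_usd / function_count
```

With the numbers from the comment, `cost_per_function(150, 67_000)` comes out to a bit over a fifth of a cent per function.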

whattheheckheck · 2mo ago

Do you have the repo example?

heeen2 · 2mo ago

I did something similar using ghidramcp to dig around this keyboard firmware. The repo contains the Ghidra project, a Linux driver, and even patches to the original stock firmware: https://github.com/echtzeit-solutions/monsgeek-akko-linux
