Nobody Reviews Compiler Output

(skiplabs.io)

10 points · rzk · 5d ago · 14 comments

tsimionescu 3d ago

> Compilers have type systems, formal contracts about what code means before it runs.

This is a complete misunderstanding of what makes compilers trustworthy. Those are all properties of the language, not the compiler. The compiler is trustworthy to the extent that it is well built, internally. It is trustworthy to the extent that the mapping from source code to machine code is well defined, and implemented correctly.

You can have the best type system you want, but if the compiler is badly implemented, it won't be trustworthy. A perfect example is C - a language that barely has a type system, yet has some of the most trustworthy and optimized compilers. And it also has, or at least had, plenty of buggy compilers, typically for small embedded platforms with complicated mappings between C constructs and the limited CPU instruction set.

fuhsnn 3d ago

Some of us do spend hours on godbolt.org tweaking code like it was a game character build.

Pannoniae 3d ago

Just like LLMs, compilers are just another layer of abstraction, and no, they're not deterministic.

Just yesterday I reported a codegen bug in MSVC. (Luckily they fixed it very fast.) Can you realise that it's an optimiser bug without inspecting the assembly? Hardly.

All the arguments people claim against LLMs are similarly applicable to compilers, but compilers are old technology and LLMs are new.

If you're an expert, just about every compiled function contains obvious inefficiencies, and a skilled assembly programmer can speed it up by something in the ballpark of 3x. If we're talking about your average webapp, you can usually get 1000x better resource usage in most ways, including CPU, RAM, storage and so on.

And the output isn't deterministic either - bugs notwithstanding, code generation is highly chaotic, optimisations have non-local impacts and you can't easily predict optimised codegen output from source.

LLMs aren't much worse. They have non-deterministic output, but you can steer it - similarly to a compiler. An expert can use it to gain great speed and efficiency, but in the hands of someone not as capable, you can make something awful just as fast. Both tools are force multipliers.

zby 3d ago

""" we need to build:

    Formal specification layers that agents execute against, not just prompts

"""

It is probably easier to just write that program.

keybored 3d ago

I only skimmed this. Lots of “not to be read, but to be verified; process, not the artifact; not x but...”.

“AI-checks-AI pipelines as first-class CI infrastructure, not bolt-on curiosity”—what’s the contrast here? Is it serious aspiration, not unserious aspiration?

“Formal specification layers that agents execute against, not just prompts”—Okay.

It just looks like it is stating lots of problems with an x-not-y framing, as if progress were being made by way of insistence.

I am open to the idea of something like a small verification kernel that can be comprehended by “humans” which can check GenAI output. But right now we can contrast mature (decade+) compilers with GenAI like this.

- Compilers: You get the abstraction you asked for: it might not be “optimal” code, but it is code that works the way you wrote it

- GenAI: Here is 200KLOC, good luck, could be anything

Now you could reduce the space of those 200KLOC with tests and verification. But so far (based on this submission) it looks like this is at the handwaving stage.

Certainly you would need high-value tests if tests are the thing that is supposed to be the verification. Either something simple and expressive enough for “humans” to write or something that is both short and easy to read for “humans” (and generated by GenAI). Not some copy-paste smelling mockfest that looks like it is a pile of junk that has evolved over five years, each author pushing some junk on top while taking care to not make the whole pile tilt and collapse.
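
For illustration only, here is roughly what a "short and easy to read" test could look like: a randomized property check instead of a pile of mocks. This is a sketch under assumptions, not anything from the submission; `check_sort` and `sort_fn` are hypothetical names, and sorting is just a stand-in for whatever the generated code is supposed to do.

```python
import random
from collections import Counter


def check_sort(sort_fn, trials=200, seed=0):
    """Return a failing input for sort_fn, or None if no trial failed.

    sort_fn is a hypothetical stand-in for generated code under review.
    The properties are the whole specification: the output is ordered
    and is a rearrangement of the input.
    """
    rng = random.Random(seed)  # fixed seed so a failure is reproducible
    for _ in range(trials):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        out = list(sort_fn(list(data)))
        # Property 1: every adjacent pair is in non-decreasing order.
        if any(a > b for a, b in zip(out, out[1:])):
            return data
        # Property 2: same elements with the same multiplicities.
        if Counter(out) != Counter(data):
            return data
    return None


# A correct implementation passes every trial.
print(check_sort(sorted))
# A plausible-looking but wrong one (drops duplicates) yields a counterexample.
print(check_sort(lambda xs: sorted(set(xs))))
```

The two properties fit on a screen and say what the code must do without saying how, which is the kind of thing a human can actually review. It only samples the input space, so it narrows the "could be anything" problem rather than closing it.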

secos 3d ago

ah yes.

Let's indeed treat non-deterministic output exactly like we treat deterministic output.

mathisfun123 3d ago

this take is peak dunning-kruger:

https://github.com/llvm/llvm-project/tree/main/llvm/test/Cod...

xyzzy_plugh 3d ago

All other arguments aside... Yes, people do review compiler output, all the time in fact!

When optimizing code it's not unusual to look at the assembly. It's not unusual to look for opportunities for autovectorization or to verify inlining or loop unrolling.

Compilers are, for the most part, deterministic. This means after people have reviewed the output, it's unlikely to change. It also means if it does change, only a few people are required to notice.

None of this applies to LLMs. They are worse than compilers, with regard to the quality and characteristics of their output, in every possible way.

If no one reviewed compiler output then https://godbolt.org/ wouldn't exist.
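
A rough sketch of the "review once, then only notice changes" workflow, using CPython's built-in bytecode compiler as a stand-in for a native one (that substitution is an assumption made so the example is self-contained; the same idea applies to assembly from godbolt.org). `SOURCE`, `opnames` and `fingerprint` are hypothetical names.

```python
import dis
import hashlib

# Hypothetical expression standing in for a hot function under review.
SOURCE = "a * b + c"


def opnames(source):
    """List the instructions the compiler emitted, for a human to read."""
    code = compile(source, "<sketch>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


def fingerprint(source):
    """Hash the emitted bytecode so later changes can be detected cheaply."""
    code = compile(source, "<sketch>", "eval")
    return hashlib.sha256(code.co_code).hexdigest()


# Step 1: someone reads the output once and records its fingerprint.
print(opnames(SOURCE))
reviewed = fingerprint(SOURCE)

# Step 2: recompiling the same source with the same compiler matches,
# so nobody has to re-read anything.
print(fingerprint(SOURCE) == reviewed)

# Step 3: a change that alters the output shows up as a mismatch,
# which is the only time a human needs to look again.
print(fingerprint("a * (b + c)") == reviewed)
```

The fingerprint is only stable for one compiler version and one set of flags; a compiler upgrade is exactly the event where the few people who care go back and re-read the output.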
