When optimizing code it's not unusual to look at the assembly. It's not unusual to look for opportunities for autovectorization or to verify inlining or loop unrolling.
Compilers are, for the most part, deterministic. This means after people have reviewed the output, it's unlikely to change. It also means if they do change, only a few people are required to notice.
None of this applies to LLMs. They are worse than compilers, in regards to the quality and characteristics of their output, in every possible way.
If no one reviewed compiler output then https://godbolt.org/ wouldn't exist.
This is a complete misunderstanding of what makes compilers trustworthy. Those are all properties of the language, not the compiler. The compiler is trustworthy to the extent that it is well built, internally. It is trustworthy to the extent that the mapping from source code to machine code is well defined, and implemented correctly.
You can have the best type system you want, but if the compiler is badly implemented, it won't be trustworthy. A perfect example is C - a language that barely has a type system, yet has some of the most trustworthy and optimized compilers. And it also has, or at least had, plenty of buggy compilers, typically for small embedded platforms with complicated mappings between C constructs and the limited CPU instruction set.
Just yesterday I've reported a codegen bug in MSVC. (Luckily they've fixed it very fast.) Can you realise that it's an optimiser bug without inspecting the assembly? Hardly.
All the arguments people claim against LLMs are similarly applicable to compilers, but compilers are old technology and LLMs are new.
If you're an expert, just about every compiled function contains obvious inefficiencies, and a skilled assembly programmer can speed it up by in the ballpark of 3x. If we're talking about your average webapp, you can usually get 1000x better resource usage in most ways, including CPU, RAM, storage and so on.
And the output isn't deterministic either - the bugs no withstanding, code generation is highly chaotic, optimisations have non-local impacts and you can't easily predict optimised codegen output from source.
LLMs aren't much worse. They have non-deterministic output, but you can steer it - similarly to a compiler. An expert can use it to gain great speed and efficiency, but in the hands of someone not as capable, you can make something awful just as fast. Both tools are force multipliers.
Formal specification layers that agents execute against, not just prompts
"""It is probably easier to just write that program.
“AI-checks-AI pipelines as first-class CI infrastructure, not bolt-on curiosity”—what’s the contrast here? Is it serious aspiration, not unserious aspiration?
“Formal specification layers that agents execute against, not just prompts”—Okay.
It just looks like it is stating lots of problems with a x-not-y as if there is progress being made by way of insistence.
I am open to the idea of something like a small verification kernel that can be comprehended by “humans” which can check GenAI output. But right now we can contrast mature (decade+) compilers with GenAI like this.
- Compilers: You get the abstraction you asked for: it might not be “optimal” code, but it is code that works the way you wrote it
- GenAI: Here is 200KLOC, good luck, could be anything
Now you could reduce the space of those 200KLOC with tests and verification. But so far (based on this submission) it looks like this is at the handwaving stage.
Certainly you would need high-value tests if tests are the thing that is supposed to be the verification. Either something simple and expressive enough for “humans” to write or something that is both short and easy to read for “humans” (and generated by GenAI). Not some copy-paste smelling mockfest that looks like it is a pile of junk that has evolved over five years, each author pushing some junk on top while taking care to not make the whole pile tilt and collapse.
Lets indeed treat non-deterministic output exactly like we treat deterministic output.
https://github.com/llvm/llvm-project/tree/main/llvm/test/Cod...