undefined | Better HN

0 pointslocusofself16d ago0 comments

If you spend 5-15x the time reviewing what the LLM is doing, are you saving any time by using it?

0 comments

No, but that's the crux of the AI problem in software. Time to write code was never the bottleneck. AI is most useful for learning, either via conversation or by seeing examples. It makes writing code faster too, but only a little after you take into account review. The cases where it shines are high-profile and exciting to managers, but not common enough to make a big difference in practice. E.g AI can one-shot a script to get logs from a paginated API, convert it to ndjson, and save to files grouped by week, with minimal code review, but only if I'm already experienced enough to describe those requirements, and, most importantly, that's not what I'm doing every day anyway.

brandensilva16d ago

I'm finding it in some cases I'm dealing with even more code given how much code AI outputs. So yeah, for some tasks I find myself extremely fast but for others I find myself spending ungodly amounts of time reviewing the code I never wrote to make sure it doesn't destroy the project from unforseen convincing slop.

ritlo16d ago

A related Dirty Secret that's going to become clear from all this is that a very large proportion of code in the wild (yes, even in 2026—maybe not in FAANG and friends, IDK, but across all code that is written for pay in the entire economy) has limited or no automated test coverage, and is often being written with only a limited recorded spec that's usually fleshed out only to the degree needed (very partial) as a given feature is being worked on.

What do the relatively hands-off "it can do whole features at a time" coding systems need to function without taking up a shitload of time in reviews? Great automated test coverage, and extensive specs.

I think we're going to find there's very little time-savings to be had for most real-world software projects from heavy application of LLMs, because the time will just go into tests that wouldn't otherwise have been written, and much more detailed specs that otherwise never would have been generated. I guess the bright-side take of this is that we may end up with better-tested and better-specified software? Though so very much of the industry is used to skipping those parts, and especially the less-capable (so far as software goes) orgs that really need the help and the relative amateurs and non-software-professionals that some hope will be able to become extremely productive with these tools, that I'm not sure we'll manage to drag processes & practices to where they need to be to get the most out of LLM coding tools anyway. Especially if the benefit to companies is "you will have better tests for... about the same amount of software as you'd have written without LLMs".

We may end up stuck at "it's very-aggressive autocomplete" as far as LLMs' useful role in them, for most projects, indefinitely.

On the plus side for "AI" companies, low-code solutions are still big business even though they usually fail to deliver the benefits the buyer hopes for, so there's likely a good deal of money to be made selling companies LLM solutions that end up not really being all that great.

slopinthebag16d ago

Re. productivity, if LLM's are a genuine boost with 1/3 of the work, neutral 1/3 of the time, and actually worse 1/3 of the time, it's likely we aren't really seeing performance improvements as 1) people are using them for everything and b) we're still learning how to best use them.

So I expect over time we will see genuine performance improvements, but Amdahl's law dictates it won't be as much as some people and ceo's are expecting.

ansibsha16d ago

> better-specified software

Code is the most precise specification we have for interfacing with computers.

xp8415d ago

Sure, but if you define the code as the only spec, then it is usually a terrible spec, since the code itself specifies bugs too. And one of the benefits of having a spec (or tests) is that you have something against which to evaluate the program in order to decide if its behavior is correct or not.

Incidentally, I think in many scenarios, LLMs are pretty great at converting code to a spec and indeed spec to code (of equal quality to that of the input spec).

tmaly16d ago

There are some cases where AI is generating binary machine code, albeit small amounts. What do we have when we don't have the code?

1 more reply

interestpiqued15d ago

You’re missing the point of a spec

1 more reply

dboreham15d ago

Bingo. Hopefully there are some business opportunities for us in that truth.

_wire_16d ago

> because the time will just go into tests that wouldn't otherwise have been written

Writing tests to ensure a program is correct is the same problem as writing a correct program.

Evaluating conformance is a different category of concern from ensuring correctness. Tests are about conformance not correctness.

Ensuring correct programs is like cleaning in the sense that you can only push dirt around, you can't get rid of it.

You can push uncertainty around and but you can't eliminate it.

This is the point of Gödel's theorem. Shannon's information theory observes similar aspects for fidelity in communication.

As Douglas Adams noted: ultimately you've got to know where your towel is.

layer815d ago

A competent programmer proves the program he writes correct in his head. He can certainly make mistakes in that, but it’s very different from writing tests, because proofs abstract (or quantify) over all states and inputs, which tests cannot do.

shimman16d ago

These companies don't care about saving time or lowering operating costs, they have massive monopolies to subsidize their extremely poor engineering practices with. If the mandate is to force LLM usage or lose your job, you don't care about saving time; you care about saving your job.

One thing I hope we'll all collectively learn from this is how grossly incompetent the elite managerial class has become. They're destroying society because they don't know what to do outside of copying each other.

It has to end.

SchemaLoad15d ago

The submitter with their name on the Jira ticket saves time, the reviewer who has to actually verify the work loses a lot of time and likely just lets issues slip through.

marginalia_nu16d ago

To be honest, some times it's still beneficial.

For fairly straightforward changes it's probably a wash, but ironically enough it's often the trickier jobs where they can be beneficial as it will provide an ansatz that can be refined. It's also very good at tedious chores.

misnome16d ago

And spotting stuff in review! Sometimes it’s false positives but on several occasions I’ve spent ~15-30 minutes teaching-reviewing a PR in person, checked afterwards and it matched every one of the points.

bluGill16d ago

Some, but not very much. Writing code is hard. Ai will do a lot of tedious code that you procrastinate writing.

hard2416d ago

Also when you are writing code yourself you are implicitly checking it whilst at the back of your mind retaining some form of the entire system as a whole.

People seem to gloss over this... As a CEO if people don't function like this I'd be awake at night sweating.

bonesss16d ago

That’s the reverse-centaur issue I see: humans are not great at repetitive nuanced similar seeming tasks, putting the onus on humans to retroactively approve high volumes of critical code has them managing a critical failure mode at their weakest and worst. Automated reviews should be enhancing known good-faith code, manual reviews of high volume superficially sound but subversive code is begging for issues over time.

Which results the software engineering issue I’m not seeing addressed by the hype: bugs cost tens to hundreds of times their coding cost to resolve if they require internal or external communication to address. Even if everyone has been 10x’ed, the math still strongly favours not making mistakes in the first place.

An LLM workflow that yields 10x an engineer but psychopathically lies and sabotages client facing processes/resources once a quarter is likely a NNPP (net negative producing programmer), once opportunity and volatility costs are factored in.

2 more replies

bluGill16d ago

Sortof. I work on a system too large for anyone to know the whole thing. Often people who don't know each other do something that will break the other. (Often because of the number of different people - most individuals go years between this)

raw_anon_111116d ago

No I’m keeping up with the system as a whole because I’m always working at a system level when I’m using AI instead of worrying about the “how”

1 more reply

j / k navigate · click thread line to collapse

0 comments

happytoexplain16d ago

brandensilva16d ago

ritlo16d ago

We may end up stuck at "it's very-aggressive autocomplete" as far as LLMs' useful role in them, for most projects, indefinitely.

slopinthebag16d ago

So I expect over time we will see genuine performance improvements, but Amdahl's law dictates it won't be as much as some people and ceo's are expecting.

ansibsha16d ago

> better-specified software

Code is the most precise specification we have for interfacing with computers.

xp8415d ago

Incidentally, I think in many scenarios, LLMs are pretty great at converting code to a spec and indeed spec to code (of equal quality to that of the input spec).

tmaly16d ago

There are some cases where AI is generating binary machine code, albeit small amounts. What do we have when we don't have the code?

1 more reply

interestpiqued15d ago

You’re missing the point of a spec

1 more reply

dboreham15d ago

Bingo. Hopefully there are some business opportunities for us in that truth.

_wire_16d ago

> because the time will just go into tests that wouldn't otherwise have been written

Writing tests to ensure a program is correct is the same problem as writing a correct program.

Evaluating conformance is a different category of concern from ensuring correctness. Tests are about conformance not correctness.

Ensuring correct programs is like cleaning in the sense that you can only push dirt around, you can't get rid of it.

You can push uncertainty around and but you can't eliminate it.

This is the point of Gödel's theorem. Shannon's information theory observes similar aspects for fidelity in communication.

As Douglas Adams noted: ultimately you've got to know where your towel is.

layer815d ago

shimman16d ago

It has to end.

SchemaLoad15d ago

The submitter with their name on the Jira ticket saves time, the reviewer who has to actually verify the work loses a lot of time and likely just lets issues slip through.

marginalia_nu16d ago

To be honest, some times it's still beneficial.

misnome16d ago

bluGill16d ago

Some, but not very much. Writing code is hard. Ai will do a lot of tedious code that you procrastinate writing.

hard2416d ago

Also when you are writing code yourself you are implicitly checking it whilst at the back of your mind retaining some form of the entire system as a whole.

People seem to gloss over this... As a CEO if people don't function like this I'd be awake at night sweating.

bonesss16d ago

2 more replies

bluGill16d ago

raw_anon_111116d ago

No I’m keeping up with the system as a whole because I’m always working at a system level when I’m using AI instead of worrying about the “how”

1 more reply

j / k navigate · click thread line to collapse