I strongly suspect most closed-source code developed under commercial or internal pressure is pretty awful after a few years of development.
All LLM code has to do is suck less than existing code. And that's presuming the code quality doesn't improve as the models, the harnesses and our ways of working with them improve.
LLMs don't have this benefit. You forget to add the correction to the system prompt, and the LLM will repeat the same mistake over and over. Worse than that, their mistakes aren't based on any understanding; they're basically random guesses.
Humans, even bad coders, still seem to have some sort of architecture in mind, even if it's spaghetti, whereas LLMs (obviously) don't think more than a few steps ahead, and never about the full scope of what they're contributing to. That's on purpose, too, because you want the context to be as small as possible when you work with LLMs.
With LLMs you need to tread carefully between "What does the LLM need to know?" and "Can I skip passing this to the LLM this time?", while with a human you can more or less dump everything you're sitting on and let them sift through it, and they'll mostly make it out OK.
Whilst I don't claim any true "understanding" (that's a very loaded term), that doesn't mean it's just random guesses.
Anyone using recent LLM coding agents on a regular basis would probably agree that there's something going on that fits some non-anthropomorphizing, non-sentience-assigning definition of "understanding".
As for the point about improvement - I think that's an orthogonal issue to the overall code quality. With regard to human codebases, there are plenty of scenarios that negate the improvement of individuals. We're comparing organizations with LLMs, not individuals with LLMs, and that makes a significant difference.
Not random across their whole training set. Random across related concepts bundled together in the training set. Which is not that dissimilar to human mistakes.
A human's mistakes are also based on going with one option from their training and not another, where the two are close together but the one picked is not appropriate and doesn't fully cover the expected result.
That's obvious in a typo (you get close to the target word but just miss it), but also in off-by-one errors (you're still in the proximity of the correct loop you should have written), all the way to picking the wrong architecture or pattern (you still chose among patterns for the worse fit you've picked, you don't suddenly start using cooking recipes).
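To make the off-by-one case concrete, here's a tiny Python sketch. The function names and data are made up purely for illustration. The buggy loop is only a couple of characters away from the correct one, which is the "proximity" point above.

```python
def sum_first_n_buggy(values, n):
    """Meant to sum the first n items, but stops one short."""
    total = 0
    for i in range(n - 1):  # off by one: only covers indices 0..n-2
        total += values[i]
    return total


def sum_first_n(values, n):
    """Sums the first n items (indices 0..n-1)."""
    total = 0
    for i in range(n):
        total += values[i]
    return total


data = [1, 2, 3, 4]  # hypothetical sample data
print(sum_first_n_buggy(data, 3))  # 3
print(sum_first_n(data, 3))        # 6
```

The mistake lands right next to the right answer rather than somewhere unrelated.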
i don't see why software engineers are paid so well, and are so hard to hire?
just dump a bunch of requirements on a homeless person and it'll just work out
But anyway, let the LLM verify the code and give advice on improvements, but don't let it write code that goes unverified. That's my opinion on it anyway.