I doubt it needs to be more than 20-50kloc.
You can create a full 3D game with a custom 3D engine in 500k lines. What the hell is Claude Code doing?
Engineers using LOC as a measure of quality is the inverse of managers using LOC as a measure of productivity.
The principles of good software don't suddenly vanish just because now it's a machine writing the code instead of a human, they still have to deal with the issues humans have for more than half a century. The history of programming is new developers coming up with a new paradigm, then rediscovering all the issues that the previous generation had figured out before them.
It turns out that there is a tradeoff in code between velocity and quality that smart businesses consider relative to hardware cost/quality. The businesses that are outcompeting others are rarely those who have the highest quality code, but rather those that are shipping quickly at a quality level that is satisfactory for current hardware.
But given that we know the functionality of Claude Code, we can guess how much complexity should be required. We could also be wrong.
>Why does it matter?
If there’s massively more code than there needs to be that does matter to the end user because it’s harder to maintain and has more surface area for bugs and security problems. Even with agents.
The more lines of code you have the more likely there is for one of them to be wrong and go unnoticed. It results in bugs, vulnerabilities,... and leaks.
Because it's unmaintainable slop that they themselves don't know how to fix when something happens? https://news.ycombinator.com/item?id=47598488
At some point someone will probably take their LLM code and repoint it at the LLM and say 'hey lets refactor this so it uses less code is easier to read but does the same thing' and let it chrun.
One project I worked on I saw one engineer delete 20k lines of code one day. He replaced it with a few lines of stored procedure. That 20k lines of code was in production for years. No one wanted to do anything with it but it was a crucial part of the way the thing worked. It just takes someone going 'hey this isnt right' and sit down and fix it.
Sure. You could have. But you're not the one playing football in the Champions League.
There were many roads that could have gotten you to the Champions League. But now you're in no position to judge the people who got there in the end and how they did it.
Or you can, but whatever.
The only reason people are using Claude Code is because it's the only way to use their (heavily subsidized) subscription plans. People who are okay with using and paying for their APIs often opt out for other, better, tools.
Also, analogies don't work. As we know for a fact that Claude Code is a bloated mess that these "champions league-level engineers" can't fix. They literally talk about it themselves: https://news.ycombinator.com/item?id=47598488 (they had to bring in actual Champions League engineers from bun to fix some of their mess).
You just repeat the same statement.
That bloated mess is what got them to the Champions League. They did what was necessary to get them here. And they succeeded so far.
But hey, according to some it can be replicated in 50k lines of wrapper code around a terminal command, so for Anthropic it's just one afternoon of vibe coding to get rid of this mess. So what's the problem? /s
What every developer learns during their “psh i could build that” weekendware attempt is that there is infinite polish to be had, and that their 20k loc PoC was <1% of the work.
That said, doesn't TFA show you what they use their loc for?
This file is exactly what I'm talking about.
Take the loadInitialMessage function: It's encumbered with real world incremental requirements. You can see exactly the bolted-on conditionals where they added features like --teleport, --fork-session, etc.
The runHeadlessStreaming function is a more extreme version of that where a bunch of incremental, lateral subsystems are wired together, not an example of superfluous loc.
It's a completely different domain, e.g. very different integration surface area and abstractions.
Claude Code's source is dumped online so there's probably a more concrete analysis to be had than "that sounds like too many loc".
For example, I found an implementation of a PRNG, mulberry32 [1], in one of the files. That's pretty strange considering TS and Javascript have decent PRNGs built into the language and this thing is being used as literally just a shuffle.
[1] https://github.com/AprilNEA/claude-code-source/blob/main/src...
If you search mulberry32 in the code, you'll see they use it for a deterministic random. They use your user ID to always pick the same random buddy. Just like you might use someone's user ID to always generate the same random avatar.
So that's 10 lines of code accounted for. Any other examples?