0 points · MontyCarloHall · 1d ago · 0 comments

   I don’t prompt Claude anymore. I have loops running that prompt Claude and figuring out what to do. My job is to write loops.
   — Boris Cherny, head of Claude Code

Reliability is a direct reflection of the quality of the underlying infrastructural code. If even Anthropic, the company with the world's best agentic vibecoders, has horribly unreliable infrastructure, it really says something about the quality of the world's best agentically produced code.

28 comments · 6 top-level

hombre_fatal · 1d ago · 9 in thread

Meh, this is the "must be the veganism" fallacy: if someone knows you're vegan, then any ailment you might have, no matter how ubiquitous in the population, must be somehow due to your vegan diet and no more details are required.

Except now it's the "AI did it" fallacy where if you know a company uses AI, even infra scaling issues must be due to AI, and if you had just used less or no AI, you would have been spared even though that has never been true.

The usual response to this goes something like "well they made claims that AI is good" therefore anything short of perfection supposedly debunks the claim.

gls2ro · 1d ago

This is not like that.

This is literally them saying they are letting their LLM run wild(ish), and looking at status.claude.com we can see the result.

This is a case where the outcome is the direct result of the engineering practices like the ones they describe.

PS: Yes, I use Claude, Codex, Amp and Cursor agents every day, so I am not saying here that LLMs are not valuable.

LE: They did not make claims that "AI is good"; they made claims that developers/computer engineers will not be needed anymore in the near future. That is a stronger claim, and it has a direct relation to a product they have which needs computer engineering (yes, infra counts too) and which seems to be down more than we would expect for a good quality bar.

brookst · 1d ago

You just said "it's not the 'must be veganism' thing, it's the 'must be veganism' thing"

Unless you have inside knowledge of their infra ops and management tools, it is just guessing and blaming veganism. For all we know it could be tools from Nvidia or anyone else failing under massive load.

It could be the veganism. Some things are. Leaping to it as the only possible explanation for every ailment is exactly the fallacy.

hombre_fatal · 1d ago

But it is like that. You have zero insight into the infrastructure issue. And the person quoted above is a Claude Code developer. So because this guy uses Claude generously to build Claude Code, then Anthropic's API scaling issues must necessarily be caused by his agent loops even though scaling issues plague every tech company, no less often pre-AI.

The issue is that it's a thought-terminating cliche, and it would be nice to have one place on the internet that isn't just who can post one the fastest with the most glee to the giddy seal-clapping of the audience.

MontyCarloHall (OP) · 1d ago

Another data point: GitHub is extremely insistent its employees maximally use AI for internal development [0], and we’ve concomitantly seen its reliability fall off a cliff in the last year or so.

[0] https://github.com/resources/insights/ai-powered-workforce-p...

svachalek · 1d ago

There’s a difference between having normal levels of difficulty and bad luck, and having people blame those on the wrong thing, vs having extraordinarily miserable quality and having people find the obvious difference. Potentially yes, they might have terrible wiring in their office or a crippling fondness for vim. But if I were their PR department I’d be talking about that if it was the problem.

trollbridge · 1d ago

If you go around bragging that you use AI for everything as part of your marketing plan, then don't be surprised that people blame your heavy AI usage when you have a problem.

tcp_handshaker · 1d ago

Ahem...

"Vegans and vegetarians may have higher stroke risk" - https://www.bbc.com/news/health-49579820

"Vegans had a 43% higher risk of fractures overall compared to nonvegetarians, as well as higher risks of hip, leg, and vertebral fractures." - https://sniglobal.org/plant-based-diets-and-fracture-risk/

"The Impact of a Vegan Diet on Many Aspects of Health: The Overlooked Side of Veganism" - https://www.cureus.com/articles/138315-the-impact-of-a-vegan...

"..people who followed a vegan diet had noticeably low levels of iodine in their bodies, an element that is essential for growth, bones, and brain function. In addition, vegans had lower bone health scores..." - https://www.bfr.bund.de/en/press-release/vegan-vegetarian-be...

SimianSci · 1d ago

There are a lot of nutritional blind spots in vegan diets. It is a diet that requires exceptional planning and intentionality to be at a baseline of health similar to a balanced omnivorous diet.

So indeed, "it must be veganism" is not an unfounded concern when health complications arise, in much the same way that "it must be the AI" is a valid concern when software issues arise.

hombre_fatal · 1d ago

This isn't really the place for this, nor does it matter to my analogy.

But I was more getting at, say, staying out of the sun or being skinnyfat as a vegan, and suddenly you look "sickly"/"frail" when you'd be given the grace of looking like most people otherwise.

A similar analogy would be someone saying "well, of course you do" if you have any malady after having been vaccinated. My point is to contrast the thought-terminating cliche with doing the further analysis needed to link the malady to the suspected cause.

---

> "Vegans and vegetarians may have higher stroke risk"

It was a lumped vegetarian + vegan group with a weak CI bounded at 1.02 for 3/1000 cases over a decade. The same group also had a more robust benefit of less heart disease than meat eaters. The stroke outcomes aren't replicated in other cohorts either, afaik. But the heart disease benefits are.

> "Vegans had a 43% higher risk of fractures overall compared to nonvegetarians, as well as higher risks of hip, leg, and vertebral fractures."

The study used a single baseline questionnaire for 17+ years and looked at vegans with correctable nutrition deficiencies to see +15/1000 hip fractures over 10 years. I'll grant that a poorly planned diet, especially 30 years ago with less nutritional understanding, has worse health outcomes. Just like I wouldn't use the average American's diet to lambast an omnivore diet (compared to, say, the "Mediterranean" diet).

> "vegans had lower iodine, bone health scores" (RBVD study)

On bones: p=0.02 in 72 people with 5% less QUS score in their heel bone (not DXA nor bone density tested). No body weight mediation nor data about health outcomes like fractures, osteoporosis, and no time dimension since it was just a snapshot (cross-sectional).

On iodine: It's a surrogate biomarker from a single pee test. Study didn't look at iodine-related health outcomes like thyroid dysfunction, goiter, or clinical consequences.

---

dsmurrell · 1d ago · 6 in thread

I wonder how they fix things when Claude is down.

AlexB138 · 1d ago

I would bet that they have inference set up for internal use on a separate system from the customer-facing production environment. The same way telemetry infrastructure needs to be run separately from normal production systems, so you aren't "blind" when you need it most.

wsatb · 1d ago

Based on this outage: not very well.

blensor · 1d ago

This is (or will be in the future) a surprisingly relevant issue

mysterydip · 1d ago

maybe they ask a secondary agentic system to fix it. will that be the future of “redundancy”?

rdtsc · 1d ago

"Gemini, fix my Claude infra"

qsxfthnkp2322 · 1d ago

lol probably use their dev or qa Claude environment to fix prod

brookst · 1d ago · 5 in thread

Is there any indication these errors are related to Anthropic-written code as opposed to operational issues from the fastest-growing infra buildout ever?

Layer-wise, the app is pretty far removed from request routing to GPU pools.

organsnyder · 1d ago

This is almost certainly a software issue, though. Even if it's due to scaling, they still built a system that failed catastrophically rather than degrading gracefully.

brookst · 1d ago

Sure. But could it be k8s config? Could it be Nvidia Bright Cluster? Could it be load balancing?

I'm not saying Anthropic isn't to blame for a system that is literally approaching one-nine uptime; they certainly are. I am saying that jumping to "it must be vibe coding's fault" is an emotional, confirmation-biased belief, not an evidence-based one.
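
For scale, "one-nine uptime" means roughly 90% availability. A small sketch of the arithmetic, assuming a 365-day year; the function name is made up for illustration, and nothing here is measured from any real status page:

```python
def downtime_hours_per_year(nines: int) -> float:
    """Allowed downtime, in hours per 365-day year, at `nines` nines of availability."""
    unavailability = 10.0 ** (-nines)  # one nine -> 0.1, two nines -> 0.01, ...
    return unavailability * 365 * 24


if __name__ == "__main__":
    # One nine is 10% unavailability: 876 hours, i.e. 36.5 days per year.
    for n in (1, 2, 3, 4):
        hours = downtime_hours_per_year(n)
        print(f"{n} nine(s): about {hours:.1f} hours of downtime per year")
```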

dpark · 1d ago

> failed catastrophically rather than degrading gracefully

You mean like returning 529s and operating with reduced QoS?

MontyCarloHall (OP) · 1d ago

Right. If this were truly a pure scaling issue, I’d expect the interface would offer an archive.is-esque “Claude is at capacity; your prompt is #XXX/YYY in the queue; estimated time remaining: ZZZ seconds”

Instead, the whole system just shits the bed, catastrophically.
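
A minimal sketch of what such a queue-position response could look like. Everything here is hypothetical: the class and function names, the fixed average service time, and the simple ETA formula are assumptions for illustration, not a description of how Anthropic's systems work.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueueStatus:
    position: int        # 1-based place in line
    queue_length: int    # total requests currently waiting
    eta_seconds: float   # rough estimate, not a guarantee


class AdmissionQueue:
    """Bounded waiting line: admit up to `max_waiting` requests, shed the rest."""

    def __init__(self, max_waiting: int, avg_service_seconds: float, workers: int) -> None:
        self.max_waiting = max_waiting
        self.avg_service_seconds = avg_service_seconds
        self.workers = workers
        self._waiting = deque()

    def enqueue(self, request_id: str) -> Optional[QueueStatus]:
        # When the line is full, return None so the caller can reject quickly
        # (for example with an "overloaded" error) instead of piling up work.
        if len(self._waiting) >= self.max_waiting:
            return None
        self._waiting.append(request_id)
        return self.status(request_id)

    def status(self, request_id: str) -> QueueStatus:
        position = self._waiting.index(request_id) + 1
        # Assumed ETA model: each worker drains one request per average service time.
        eta = position * self.avg_service_seconds / self.workers
        return QueueStatus(position, len(self._waiting), eta)


def format_message(status: QueueStatus) -> str:
    return (
        f"Claude is at capacity; your prompt is #{status.position}/{status.queue_length} "
        f"in the queue; estimated time remaining: {status.eta_seconds:.0f} seconds"
    )


if __name__ == "__main__":
    queue = AdmissionQueue(max_waiting=2, avg_service_seconds=10.0, workers=2)
    for request_id in ("a", "b", "c"):
        result = queue.enqueue(request_id)
        print(format_message(result) if result else f"{request_id}: rejected, queue full")
```

The point of the bound is that a full queue produces a fast, explicit rejection rather than unbounded waiting, which is one common way to degrade gracefully under load.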

Insanity · 1d ago

I'm not sure if that's really an Anthropic problem you're pointing to vs a problem that their infra layer handles (Amazon, Google, whatever hyperscaler). I.e., they might be scaling quickly but they are running on top of established infrastructure.

MattGaiser · 1d ago · 1 in thread

On the other hand we are also willing to buy it, so reliability is arguably not as valued a good as people assumed.

matltc · 1d ago

Some of us are unsubscribing, what with the coming face scans/enshittification/downtime/throttling...

rvz · 1d ago · 1 in thread

He is a salesman at this point and is not talking to you. He is talking to the investors who want to vibe code loops to waste tokens on building slop to get rid of you.

Goes to show how fake this industry has become when VC dollars have flooded it.

Somehow it is fine to vibe code infrastructure or security because someone (with a clear vested interest) wants you to spend more tokens at their casino because that is how they "win" at the casino (which they work at).

Except in reality, this part of software is critical, and it is irresponsible to "write loops" for it, and we all know that he doesn't believe what he is saying.

nomel · 1d ago

It's very very clear they're eating their own dog food, in a product space built on tech that didn't really exist publicly 5 years ago, to the success of billions, that people increasingly depend on. Maybe I'm an optimist, but I can't fathom the intense negativity or perspective of failure here.

Don't use it. Maybe wait a few more years. If it's not valuable/useful, then not using it, while everything matures, will not be a problem.

TacticalCoder · 1d ago

> If even Anthropic, the company with the world's best agentic vibecoders...

But that's really not what they have. They have AI experts who are creating incredible LLMs.

Everything else is more than meh: Claude Code is really bad. Such a turd would never have gained any traction if it wasn't for the LLMs behind it.

I use LLMs to code daily (Claude Code still, mind you, for I didn't take the time to switch yet) and these models are both amazing and pathetic.

If you don't verify everything they output, they do the absolute craziest thing imaginable.

One example: I had an Anthropic model notice a "pattern" in range-bound integer values. I had them range-bound between, e.g., 0xCAFE0000 and 0xCAFEFFFF. At some point a comparison/validation was needed, and the Anthropic model went ballistic: instead of doing an integer comparison, it converted the numbers to strings, then started doing substring matching on "0xCAFE", and went even more "expert" by verifying at which position the match was happening. All that while explaining why it couldn't possibly fail.
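
To make the contrast concrete, here is a rough sketch. The range bounds come from the example above; both function names are hypothetical, and the string version is only a guess at the shape of the generated code, which isn't shown here.

```python
LOW = 0xCAFE0000
HIGH = 0xCAFEFFFF


def in_range(value: int) -> bool:
    """The plain integer comparison: one line, no conversions."""
    return LOW <= value <= HIGH


def in_range_by_string(value: int) -> bool:
    """Hypothetical reconstruction of the string approach: format the number
    as hex, then check that "0xcafe" matches at position 0."""
    return hex(value).find("0xcafe") == 0


if __name__ == "__main__":
    # Both agree on values inside the range, but the prefix match also
    # accepts longer numbers that merely start with the same hex digits.
    for value in (0xCAFE0000, 0xCAFEFFFF, 0xCAFF0000, 0xCAFE12345):
        print(hex(value), in_range(value), in_range_by_string(value))
```

In this reconstruction the prefix match accepts 0xCAFE12345, which lies far outside the range, so the string version needs an extra length check just to match what the integer comparison does for free.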

Why did it do that? Very likely because, in a comment, it saw "0xCAFE..." as a string. And the thing saw a pattern.

Can you believe it? There's a pattern. So it must light up connections. We've got a pattern!

No amount of kludge, hidden pre-processing, or hidden post-processing is going to fix the "quality" of the code produced by something that, instead of doing an integer comparison, converts things to strings and then does substring searches and index computations.

There's no fixing that.

Yesterday: I had to use three guard clauses before pushing data... Two of the three "logic gates" (as the model would explain they were, which is kinda right) it got right. The third one: same thing... It was planning to go ballistic and introduce countless lines of code and insane abstractions for a test that was solved with a one-line timestamp comparison.

It's because it does things like that that the people who explain that they don't code anymore are delusional if they think this gives, as of today, quality code.

It's like that other dude who was happy to produce 37 K LOC per day and counting.

> ... it really says something about the quality of the world's best agentically produced code

Oh it is totally shit code. But if you monitor everything and vet everything they do, it's helpful.

I find these LLMs way more helpful at finding the source of bugs (not fixing them: finding them, which is 90% of the job anyway) and at acting like rubber ducks than at writing code.

Claude Code sucks. Claude Code CLI sucks. Their only "solution" to every problem is to create VMs and headless browsers and resort to incredible hacks (the infamous "game loop" that modifies the characters output by the LLM is just shameful) etc. to try to hide the misery. It's miserable kludges everywhere.

And the only reason these miserable kludges are not entirely falling apart is because they rest on the shoulders of actual giants: projects like Linux, QEMU, etc. that were not vibe-coded.

It's sad to have useful tools (the models) and to make such poor use of them.

I'm pretty sure that, in the end, it's just like open-source powering the entire world by now: we'll have open-source projects like Pi and then newer ones that are going to come out and fix the mess we have now. And they're not going to be 100% vibe-coded by people whose job is "to write loops".
