0 points · sarchertech · 1mo ago · 0 comments

How do you know whether the tests it spits out are bad if you don’t read the tests?

We’re not dealing with AGI here. Tests aren’t strictly necessary for humans. They are for AI. AI requires guardrails to keep from spinning out. That’s essentially the entire premise of the agentic workflow.

thunky · 1mo ago

> How do you know whether the tests it spits out are bad if you don’t read the tests?

I do read the tests (quickly, I admit) and so does OP:

> Architecture overview sure, and testing yes, but not reading the code directly any more.

Reading that again I may have misunderstood what they meant by "testing yes", though.

sarchertech (OP) · 1mo ago

I’m pretty sure they just meant that they do testing, not that they read the tests, and that’s how everyone else who responded interpreted it as well.

You can get Claude to write good tests, but based on what I’m seeing at work, that’s not what’s happening. The tests always look plausible even when they’re wrong, so people either don’t read them, skim them very quickly, or read the first few, assume the rest work, and commit.

I think Claude is great for testing because setting up test data and infrastructure is such a boring slog. But it almost always takes a lot of back and forth and careful handholding to get it right.

mlazos · 1mo ago

I read the tests. It’s also really, really good to have Claude verify that removing the changes in question breaks the tests. This brings the quality way, way up for me.
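For anyone who wants to see what that check amounts to, here is a minimal sketch in Python. Everything in it is hypothetical and made up for illustration: `median_before` stands in for the code without the change, `median_after` for the code with it, and `guards_change` is just one possible way to phrase "the test passes with the change and fails without it". In a real repository you would more likely revert the change on a scratch branch and rerun the actual test suite.

```python
from typing import Callable, Sequence

Impl = Callable[[Sequence[float]], float]


def median_before(values: Sequence[float]) -> float:
    """Hypothetical pre-change code: wrong for even-length input."""
    ordered = sorted(values)
    return ordered[len(ordered) // 2]


def median_after(values: Sequence[float]) -> float:
    """Hypothetical post-change code: averages the two middle values."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def weak_test(median: Impl) -> None:
    # Looks plausible, but never exercises the even-length case.
    assert median([3, 1, 2]) == 2


def strong_test(median: Impl) -> None:
    assert median([3, 1, 2]) == 2
    assert median([4, 1, 3, 2]) == 2.5


def passes(test: Callable[[Impl], None], impl: Impl) -> bool:
    """Run a test against one implementation and report pass/fail."""
    try:
        test(impl)
    except AssertionError:
        return False
    return True


def guards_change(test: Callable[[Impl], None], before: Impl, after: Impl) -> bool:
    """True only if the test passes with the change and fails without it."""
    return passes(test, after) and not passes(test, before)


if __name__ == "__main__":
    print("weak_test guards the change:", guards_change(weak_test, median_before, median_after))
    print("strong_test guards the change:", guards_change(strong_test, median_before, median_after))
```

In this toy case `weak_test` passes against both versions, so it would stay green even if the fix were reverted, which is exactly the kind of plausible-looking test the check is meant to flag.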
