undefined | Better HN

0 pointsmaccard3d ago0 comments

> The real magic of LLMs comes when they iterate until completion until the code compiles and the test passes, and you don't even bother looking at it until then.

If you read my post, you’d see that Claude code didn’t do that, I had to intervene in the agent loop and when I did it undid my fixes.

0 comments

maleldil2d ago

This is not compatible with many people's experiences. I use Python with a type checker. I tell Claude that the task is only complete once the type checker passes cleanly. It doesn't stop until there are no type errors. This should be even easier in a compiled language, especially if you also tell it to run the tests.

In fact, I find that with a strict feedback loop set up (i.e. a lot of lint rules, a strict type checker and fast unit tests), it will almost always generate what I want.

As someone else said, each step might be pretty stupid, but if you have a fast iteration loop, it can run until everything passes cleanly. My recommendation is to specify what really counts as "done" in your AGENTS.md/CLAUDE.md.

maccardOP2d ago

I tried again this morning - https://news.ycombinator.com/item?id=47487638 another hard failure that Claude code says passed.

j / k navigate · click thread line to collapse