meowface 4mo ago · 0 points

I think I and many others have found Sonnet 4.5 to generally be better than Sonnet 4 for coding.

Maybe if you conform to its expectations for how you use it. 4.5 is absolutely terrible at following directions, thinks it knows better than you, and will gaslight you until specifically called out on its mistake.

I have scripted prompts for long-duration automated coding workflows of the fire-and-forget, issue description -> pull request variety. Sonnet 4 does better than you’d expect: it generates high-quality, mergeable code about half the time. Sonnet 4.5 fails literally every time.

pawelduda 4mo ago

I'm very happy with it TBH, but a few things annoy me a little:

- it's slower than other models that will also do the job just fine (though it excels at more complex tasks),

- it's very insistent on creating loads of .MD files with overly verbose documentation on what it just did (not really what I asked it to do),

- it actually deleted a file twice and went "oops, I accidentally deleted the file, let me see if I can restore it!". I haven't seen this happen with any other agent, and the task wasn't even remotely about removing anything.

adastra22 4mo ago

The last point is how it usually fails in my testing, fwiw. It usually ends up borking something up, and rather than back out and fix it, it does a 'git restore' on the file - wiping out thousands of lines of unrelated, unstaged code. It then somehow thinks it can recover this code by looking in the git history (??).

And yes, I have hooks to disable 'git reset', 'git checkout', etc., and warn the model not to use these commands and why. So it writes them to a bash script and calls that to circumvent the hook, successfully shooting itself in the foot.
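For anyone wondering how a script can get past a hook like this, here is a minimal sketch. It assumes the hook only inspects the command line the agent submits, which may not match the actual setup described above. `is_blocked` and `BLOCKED_GIT_SUBCOMMANDS` are hypothetical names, not part of any real tool.

```python
import shlex

# Hypothetical deny-list: the comment names 'git reset' and 'git checkout';
# 'git restore' is the command described as wiping the unstaged work.
BLOCKED_GIT_SUBCOMMANDS = {"reset", "checkout", "restore"}


def is_blocked(command: str) -> bool:
    """Return True if the command line directly invokes a blocked git subcommand."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unparseable input (e.g. unbalanced quotes): refuse rather than guess.
        return True
    # Look for the token 'git' immediately followed by a blocked subcommand.
    for i, token in enumerate(tokens[:-1]):
        if token == "git" and tokens[i + 1] in BLOCKED_GIT_SUBCOMMANDS:
            return True
    return False


if __name__ == "__main__":
    print(is_blocked("git restore src/main.py"))  # direct call: caught
    print(is_blocked("bash ./fix.sh"))            # script wrapper: not inspected
```

A check of this shape never sees what is inside `fix.sh`, so a `git restore` written into that file runs unchallenged. It also misses forms such as `git -C dir reset`. Closing the gap probably means restricting what the process can do (a read-only checkout, a throwaway worktree, or a sandbox) rather than pattern-matching command strings.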

Sonnet 4.5 will not follow directions. Because of this, you can't prevent it, as you could with earlier models, from doing something that destroys the worktree state. For longer-running tasks, the probability of it doing this at some point approaches 100%.
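The "approaches 100%" point can be made concrete with a back-of-the-envelope sketch. It assumes each agent step independently carries the same small chance of a destructive action. That is a simplification, and the rate used in the example is purely illustrative, not a measurement. `prob_at_least_one_failure` is a hypothetical helper name.

```python
def prob_at_least_one_failure(p_per_step: float, steps: int) -> float:
    """Chance of at least one failure across `steps` independent steps.

    Uses 1 - (1 - p)^n, which assumes the steps are independent and share
    the same per-step failure probability.
    """
    if not 0.0 <= p_per_step <= 1.0:
        raise ValueError("p_per_step must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return 1.0 - (1.0 - p_per_step) ** steps


if __name__ == "__main__":
    # Illustrative 1% per-step rate (an assumption, not measured data).
    for n in (10, 100, 1000):
        print(n, round(prob_at_least_one_failure(0.01, n), 4))
```

Under that assumed 1% rate, the chance of at least one destructive action is roughly 63% after 100 steps and above 99.99% after 1000, which is why long fire-and-forget runs are where this hurts most.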

meowface (OP) 4mo ago

I think this is probably just a matter of noise. That hasn't often been my experience with Sonnet 4.5.

Every model from every provider, at every version I've used, has intermingled brilliant, perfect instruction-following with weird, mistaken divergence.

adastra22 4mo ago

What do you mean by noise?

In this case I can't get 4.5 to follow directions. Neither can anyone else, apparently. Search for "Sonnet 4.5 follow instructions" and you'll find plenty of examples. The current top 2:

https://www.reddit.com/r/ClaudeCode/comments/1nu1o17/45_47_5...

https://theagentarchitect.substack.com/p/claude-sonnet-4-pro...
