
0 points · hedora · 1mo ago · 0 comments

I tested Opus 4.5 and Opus 4.6, both with “high” thinking. Same box, same repo. I had them plan a moderate-complexity refactoring on a small codebase.

Observations:

4.6 had previously failed to the point where I had to wipe context. It must have written memories because it was referring to the previous conversation.

As the article points out, 4.6 went out of its way to be lazy and came up with an unusable plan. It did extra planning to avoid renaming files (the top-level task description involves reorganizing directories of files).

4.6 took twice as long to respond as 4.5.

I’m treating this as a model regression. 4.6 is borderline unusable. I’ve hit all the issues the article describes.

Also, there needs to be an obvious way to disable memory or something. The current UX is terrible, since once an error or incorrect refusal propagates, there is no obvious recovery path.

Anyway, with thinking set to high, I see drastically different behavior: much slower and much worse output from 4.6.


justinclift · 1mo ago

> Also, there needs to be an obvious way to disable memory or something.

Memory files are stored in a path under ~/.claude somewhere. It's fairly easy to find (I'm just not typing this on a PC with Claude on it atm), and from memory (heh) it's in Markdown.

If you nuke the memory file(s) then you should be good. Oh, I think the memory files are project or directory scoped from memory (heh again) too, so you should be able to keep/remove things manually without losing important stuff if you want.
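Since I can't give the exact path, here's a rough Python sketch for hunting the files down. It only assumes what's said above (Markdown files somewhere under ~/.claude); the function name `find_markdown_files` is made up for illustration, and the script just lists candidates without deleting anything, so you can look them over first.

```python
from pathlib import Path


def find_markdown_files(root: Path) -> list[Path]:
    """Return every Markdown file under root, sorted by path."""
    if not root.is_dir():
        # Nothing to search if the directory does not exist.
        return []
    return sorted(p for p in root.rglob("*.md") if p.is_file())


if __name__ == "__main__":
    # Assumption: memory lives somewhere under ~/.claude, as described above.
    for path in find_markdown_files(Path.home() / ".claude"):
        print(path)
```

Not every Markdown file it prints is necessarily a memory file, so read before you remove.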

> Anyway, with think set to high, I see drastically different behavior: much slower and much worse output from 4.6.

Might be worth trying the CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING setting then?
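If you want to try it from a script, something like this sketch would do. Assumptions on my part: the setting is read as an environment variable, and a value of "1" switches it on. The helper name is hypothetical.

```python
import os
from typing import Optional


def env_with_adaptive_thinking_disabled(
    base: Optional[dict[str, str]] = None,
) -> dict[str, str]:
    """Copy an environment and set the variable mentioned above."""
    env = dict(os.environ if base is None else base)
    # Assumption: read from the process environment, and "1" enables it.
    env["CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING"] = "1"
    return env
```

You'd then pass the result as `env=` to `subprocess.run` when launching the CLI, so the variable only applies to that one process and your shell stays untouched.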
