Better HN

0 points · winrid · 2mo ago · 0 comments

It's bad at long-running tasks.


3 comments · 1 top-level

bluegatty · 2mo ago · 2 in thread

Yes and no. It's bad because of the shorter context, but it does have auto-compaction, which was much better than Claude's. If you give it documentation to work from and re-reference, it handles long-running tasks.

Honestly - 'every inch of IQ delta' seems to be worth it over anything else.

I'm a long-time Claude Code supporter - and I'm ashamed to admit how instantly I dropped it when I discovered how much better 5.4 is.

I don't trust Claude anymore for anything that requires heavy thinking - Codex always finds flaws in the logic.

But this happens every few months.

winrid (OP) · 2mo ago

I tried to use 5.4 for something pretty straightforward - creating scripts to automate navigating a game UI and capturing the network traffic. 5.4 was super frustrating, constantly stopping and waiting for feedback, etc., even after I told it to never wait and just iterate/debug. I quit and switched to Opus 4.6, and it did much more of the work by itself.

bluegatty · 2mo ago

I've never run into that problem, but these were coding solutions in Codex with a strong plan and steps to work towards.

It could be that if you're using massive tokens on a 'plan', then they want to limit you in some way, or, if the objective is not perfectly clear, they don't want semi-random token use.

See if the token/sub solution behaves differently. Make sure that when it 'compacts', it re-reads your instructions clearly.
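
To illustrate that last point, here is a minimal sketch of the general idea, not how Codex or Claude Code actually implement compaction. Every name in it (`Session`, `max_items`, `compact`) is hypothetical, and a simple message count stands in for a real token budget. The instructions are stored outside the history that gets summarized, so compaction cannot drop them.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    # Hypothetical names for illustration; not the API of any real tool.
    instructions: str                  # pinned text that must survive compaction
    history: list[str] = field(default_factory=list)
    max_items: int = 6                 # stand-in for a token budget

    def add(self, message: str) -> None:
        """Append a message and compact once the history exceeds the budget."""
        self.history.append(message)
        if len(self.history) > self.max_items:
            self.compact()

    def compact(self) -> None:
        """Keep the two most recent messages; fold the rest into one summary line."""
        old, recent = self.history[:-2], self.history[-2:]
        summary = f"[summary of {len(old)} earlier messages]"
        self.history = [summary, *recent]

    def prompt(self) -> str:
        """Build the next prompt with the instructions always placed first."""
        return "\n".join([self.instructions, *self.history])


session = Session("Never wait for feedback; iterate and debug.")
for i in range(10):
    session.add(f"step {i}")
```

If the tool you use does not pin instructions like this, restating them (or pointing it back at your plan document) after each compaction is the manual equivalent.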

