Honestly - 'every inch of IQ delta' seems to be worth it over anything else.
I'm a long-time Claude Code supporter - and I'm ashamed to admit how instantly I dropped it on discovering how much better 5.4 is.
I don't trust Claude anymore for anything that requires heavy thinking - Codex always finds flaws in the logic.
But this happens every few months.
It could be that if you're using a massive number of tokens on a 'plan', then they want to limit you in some way, or that if the objective isn't perfectly clear, they don't want semi-random token use.
See if the token/sub solution behaves differently. Make sure that when it 'compacts', it re-reads your instructions clearly.