* make sure the model maxes out all benchmarks
* release it
* after some time, nerf it
* repeat the same with the next model
However, the net sum is positive: in general, models from 2026 are better than those from 2024.
I never asked for a 1M context window. Then I got it and it was nice; now it's as if it's gone again. No biggie, but if they had advertised it as a free trial (which is what it feels like), I wouldn't have opted in.
Anyway, it seems I'm just ranting. I still like Claude, but it nonetheless feels like the game you described above.
https://x.com/lydiahallie/status/2039800718371307603
--- start quote ---
Digging into reports, most of the fastest burn came down to a few token-heavy patterns. Some tips:
• Sonnet 4.6 is the better default on Pro. Opus burns roughly twice as fast. Switch at session start.
• Lower the effort level or turn off extended thinking when you don't need deep reasoning. Switch at session start.
• Start fresh instead of resuming large sessions that have been idle ~1h
• Cap your context window, long sessions cost more: CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000
--- end quote ---
https://x.com/bcherny/status/2043163965648515234
--- start quote ---
We defaulted to medium [reasoning] as a result of user feedback about Claude using too many tokens. When we made the change, we (1) included it in the changelog and (2) showed a dialog when you opened Claude Code so you could choose to opt out. Literally nothing sneaky about it — this was us addressing user feedback in an obvious and explicit way.
--- end quote ---
Sometimes you have to keep starting new sessions until it works. I have a feeling they route prompts to older models that have a system prompt telling them to say "I am Opus 4.6", when really it's something older and more basic. So by starting new sessions you might get lucky and land on the real latest model.