In my own testing, these models still have a different flavor to them:
- Opus 4.5 for software development. It works faster and tends to write cleaner code.
- GPT 5.2 xHigh for mathematical analysis, and analysis in general (e.g. code review, planning, double-checks). It's very meticulous.
- Gemini 3.0 Pro for image understanding, though this one I haven't played around with much.