You are right, I was mistaken about the version. I evaluated it on general chat-assistant prompts plucked from my history across a range of topics, but I did not use it for coding. There was never a time when I thought 4o was “good enough” for agentic coding.