The music/video models are cool, but it's an apples-to-oranges comparison with GPT-4. I don't think you can really compare intelligence or "advanceness" between those models and GPT-4.
I'm surprised to hear someone say that o1 and the new Sonnet are "leaps", though. My impression is that they're qualitatively similar to GPT-4: incremental improvements at best. I don't think the gap between GPT-4 and the new Sonnet is anywhere near as large as the gap between GPT-3 and GPT-4, for instance.