I don’t see definitive evidence that there is some kind of Moore’s law for model improvement though. Just because this year’s model performs better than last year’s model doesn’t mean next year’s model will be another leap. Most of the big improvements this year seem to be around tooling - I still see Opus 4.6 (which is my daily driver at work) making lots of mistakes.