If I had to collapse the difference into one sentence, it'd be that 5.5 does more of what I'm asking it to do, versus doing a small part of what I'm asking and then stopping.
5.4 required a lot of "continue" encouragement. 5.5 just "gets it" a bit more.
What it boils down to for me is that, even though it's more expensive, I would much rather use 5.5 on low than 5.4/5.3 on high/medium.
It is unable to do K.I.S.S. Instead of just adding an endpoint, it creates a service, middleware, a config reader, and finally an endpoint.
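For a sense of what "just an endpoint" can look like, here is a rough sketch using only the Python standard library's WSGI conventions. The `/health` path, the `app` name, and the JSON payload are made up for illustration and are not from any real codebase. The handler is a single function, with no service layer, middleware, or config reader.

```python
import json


def app(environ, start_response):
    """Minimal WSGI app: one hypothetical GET /health endpoint, nothing else."""
    is_health = (
        environ.get("REQUEST_METHOD") == "GET"
        and environ.get("PATH_INFO") == "/health"
    )
    if is_health:
        status = "200 OK"
        body = json.dumps({"status": "ok"}).encode("utf-8")
    else:
        # Anything other than GET /health is treated as not found.
        status = "404 Not Found"
        body = json.dumps({"error": "not found"}).encode("utf-8")

    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

To try it locally, something like `wsgiref.simple_server.make_server("127.0.0.1", 8000, app).serve_forever()` should be enough. Whether a real project needs more structure than this depends on the project, but it shows how small the baseline is.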
LLMs are nowhere near being good developers. The only thing they have is speed. Because of this speed they create the illusion of a good developer, the "whoa" moment: "Whoa, it would've taken me 2 months to implement this." Yeah, but then again you would not make such silly mistakes, and you would've reused that OIDC client instead of reinventing the wheel every single time.
I would say models hit a bottleneck a long time ago. My personal opinion is that they are now overfitting newer models on coding and "agentic" capabilities at great expense to general abilities in other domains.
Still amazing, but 5.5 does feel like incremental progress with a massive upcharge.
The reality is both Anthropic and OAI have converged on LLMs as being a thing for software production - that's where the majority of their revenue is coming from.
I.e., they had 100 compute units. Demand is 200 units. They have to do a combination of buying more compute, increasing price, lowering limits, etc.
Please stop. Critical theory is easy. Something about “X” sucks. Got it. What is the alternative? It’s the completely unserious philosophy of the peanut gallery.
If that is true, then they should all invest resources into projects that will yield efficient use of the compute. The most efficient producer then gains a huge cost advantage AND the capacity to serve more. So yeah, that logic doesn’t hold.
This doesn't always mean that there is a bottleneck in terms of raw power; it may also mean that your use cases (or the lower-hanging fruit among them) are already covered.
We'll probably see another stair-step change, followed by another plateauing curve of incremental improvements, when that happens.
Some releases are just "meh", but I wouldn't rule out exciting new stuff for 2026 just because Opus 4.7 sucked.
So quickly - this industry has had trillions thrown around to get here so quickly, heh.
But, yes, capability seems somewhat stagnant. It's about iso-perf with cost improvements, or iso-cost with perf improvements, plus agentic.