> So why not simple quota counting?
Consider this: you are Anthropic. Some Claude Code use cases will have poor caching performance. Let's say these are 10% of your usage.
You explicitly don't count cache misses against quota right now because it would make the UX poor for those use cases. That's no big deal, since the remaining 90% of use cases can subsidize the 10%.
Now open-source clients become a thing. Instead of 10% of usage having poor caching, it grows to 50%. You can no longer subsidize those users because the economics don't work.
You have to start counting cache misses and the UX goes to shit for everyone.
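
To put rough numbers on the subsidy argument, here is a back-of-the-envelope sketch. Every figure in it is made up for illustration: the 10x cost ratio between a cache miss and a cache hit is an assumption, not Anthropic's actual pricing or cost structure, and it simplifies by treating every request in a poor-caching use case as a miss and every other request as a hit.

```python
def blended_cost_per_100_requests(poor_cache_pct, hit_cost=1, miss_cost=10):
    """Total serving cost for 100 requests.

    poor_cache_pct: percent of requests that miss the cache (0-100).
    hit_cost / miss_cost: hypothetical relative costs, not real pricing.
    """
    well_cached = 100 - poor_cache_pct
    return well_cached * hit_cost + poor_cache_pct * miss_cost


today = blended_cost_per_100_requests(10)             # 10% poor caching
with_oss_clients = blended_cost_per_100_requests(50)  # 50% poor caching
print(today, with_oss_clients)  # 190 550
```

Under these assumed numbers, the cost of serving the same quota nearly triples (190 to 550) while the subscription price stays flat, which is the point where not counting cache misses stops being affordable.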