I assumed that because I’m on paid tiers, it would still cost extra beyond a certain usage amount, but I guess not.
I’m using Claude Pro as my daily driver, plus the Gemini / ChatGPT free tiers.
Not on AI Studio.
Claude Sonnet 4 now supports 1M tokens of context - https://news.ycombinator.com/item?id=44878147 - Aug 2025 (160 comments)
It isn't clear from the article whether the time they quote is time-to-first-token or time to completion. If it is the latter, then it makes sense why gemini* would take longer even with similar token throughput.
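A rough way to see the difference, assuming completion time is approximately time-to-first-token plus output tokens divided by throughput. The function name and all numbers below are made up for illustration, not taken from the article:

```python
def completion_time(ttft_s: float, output_tokens: int, tokens_per_s: float) -> float:
    """Wall-clock time to finish: wait for the first token, then stream the rest."""
    return ttft_s + output_tokens / tokens_per_s


# Hypothetical models with identical time-to-first-token and throughput.
# The only difference is how many tokens each one emits (assumed values).
terse = completion_time(ttft_s=1.0, output_tokens=500, tokens_per_s=100.0)
verbose = completion_time(ttft_s=1.0, output_tokens=2000, tokens_per_s=100.0)

print(f"terse model:   {terse:.1f}s")
print(f"verbose model: {verbose:.1f}s")
```

With the same throughput, the model that emits more tokens finishes later, so a time-to-completion figure can differ a lot even when time-to-first-token is identical.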
https://ghostarchive.org/archive/JlE5T
https://web.archive.org/web/20250812172455/https://every.to/...
Gemini has done this in ways that I haven't seen in the recent or current generation models from OpenAI or Anthropic.
It really surprised me that Gemini performs so well in multi-turn benchmarks, given that tendency.
[1]: https://www.imdb.com/title/tt0766092/quotes/?item=qt1440870