Think about ANY other product and what you'd expect from the competition that's half the price. Yet people here act like Gemini is dead weight.
____
Update:
3.1 cost 40% as much to run the AA index as Opus Thinking AND SONNET, beat Opus, and was still 30% faster on output speed.
https://artificialanalysis.ai/?speed=intelligence-vs-speed&m...
But man, people are really avid about it being an awful model.
You'd notice how good Opus is in Claude Code. IMHO CC is the secret sauce
The harness is just much better on the Anthropic side.
Like files I didn't mention being edited and read, and stuff of that nature. Sometimes this is cute, like fixing typos in docs, but when it's changing things where it clearly doesn't even understand the intent behind something, it's annoying.
Gemini 3.1 is clearly much better when trying it today. It stayed focused and found its way around without getting distracted.
If you told people Gemini 3.1 was Claude 4.7, they'd be going nuts singing its praises.
So a lot of these things are relative.
Now if that equation plays out 20K times a day, well that's one thing, but if it's 'once a day' then the cost basis becomes irrelevant. Like the cost of staplers for the Medical Device company.
Obviously it will matter, but for development ... it's probably worth it to pay $300/mo for the best model, when the second best is $0.
For consumer AI, the math will be different ... and that will be a big deal in the long run.
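A rough sketch of that volume argument, assuming a made-up price gap of one cent per call between the pricier and the cheaper model (the figure is hypothetical, not real pricing):

```python
def monthly_extra_cost(calls_per_day: int, extra_usd_per_call: float, days: int = 30) -> float:
    """Extra monthly spend from choosing the pricier model for every call."""
    return calls_per_day * extra_usd_per_call * days

# Hypothetical price gap per call, for illustration only.
EXTRA_USD_PER_CALL = 0.01

high_volume = monthly_extra_cost(20_000, EXTRA_USD_PER_CALL)  # the "20K times a day" case
low_volume = monthly_extra_cost(1, EXTRA_USD_PER_CALL)        # the "once a day" case

print(f"20K calls/day: ${high_volume:,.2f} extra per month")
print(f"1 call/day:    ${low_volume:,.2f} extra per month")
```

Same price gap, but only the high-volume case is a number anyone would bother optimizing.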
I think Gemini gives fine answers outside code tasks.
Outside of work, where I use Claude, Gemini is cheaper for me (for what I would use AI for) than both Claude and ChatGPT so Google gets my money.
But Gemini is also a great answer (possibly slightly less great or more great).
When consumers cannot easily assess a product's quality, they frequently use price as a primary indicator, equating higher costs with superior quality.
Quantity is OpenAI's.
Google's is... specialized hardware? (For now.)
Also deeper crawls, and Google Books! (Though it's unclear if they're making good use of those.)
Counterpoint: price will matter before we hit AGI
When I play with it in 'temporary chat' mode that ignores past chats and personal context directives, the responses are the typical slop littered with emojis, worthless lists, and platitudes/sycophancy. It's as jarring as turning off your adblocker and seeing the garish ad trash everywhere.
There are 4 models, all receiving the exact same prompts a few times a day, required to respond with a specific action.
In the first experiment I used gemini-3-pro-preview. It spent ~$18 on the same task where Opus 4.5 spent ~$4, GPT-5.1 spent ~$4.50, and Grok spent ~$7. Pro was burning through money so fast that I switched to gemini-3-flash-preview, and it's still outspending every other model on identical prompts. The new experiment is showing the same pattern.
Most of the cost appears to be reasoning tokens.
The takeaway here is: Gemini spends significantly more on reasoning tokens to produce lower quality answers, while Opus thinks less and delivers better results. The per-token price being lower doesn't matter much when the model needs 4x the tokens to get there.
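A minimal sketch of the arithmetic behind that last sentence. The prices and token counts below are hypothetical, not any vendor's actual pricing; the point is only that spend is tokens times price, so half the price at 4x the tokens comes out to twice the cost.

```python
def run_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Total spend for a run: tokens consumed times the per-token price."""
    return tokens / 1_000_000 * usd_per_million_tokens

# Hypothetical numbers, for illustration only (not real vendor pricing).
pricey_model = run_cost(tokens=100_000, usd_per_million_tokens=10.0)
# Half the per-token price, but 4x the tokens to reach an answer.
cheap_model = run_cost(tokens=400_000, usd_per_million_tokens=5.0)

cost_ratio = cheap_model / pricey_model
print(f"pricey: ${pricey_model:.2f}, cheap: ${cheap_model:.2f}, ratio: {cost_ratio:.1f}x")
```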
Opus: 521k input tokens; 12k out
Grok: 443k input tokens; 57k out
Gemini: 677k input tokens; 7k out
OAI: 543k input tokens; 17k out
Gemini appears to use by far the least amount of reasoning tokens, assuming they're included in the output counts.
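For anyone who wants to check the comparison, here is a small sketch over the four token counts quoted above. It assumes, as the comment does, that reasoning tokens are included in the output counts; the numbers are copied from the thread and nothing else is known about the runs.

```python
# Token counts as quoted above: (input tokens, output tokens).
usage = {
    "Opus": (521_000, 12_000),
    "Grok": (443_000, 57_000),
    "Gemini": (677_000, 7_000),
    "OAI": (543_000, 17_000),
}

def output_share(input_tokens: int, output_tokens: int) -> float:
    """Fraction of all tokens in the run that were output tokens."""
    return output_tokens / (input_tokens + output_tokens)

# Model names ordered from the smallest to the largest output count.
by_output = sorted(usage, key=lambda name: usage[name][1])

for name in by_output:
    inp, out = usage[name]
    print(f"{name:>6}: {inp:,} in, {out:,} out, output share {output_share(inp, out):.1%}")
```

Gemini has the smallest output count here and also the largest input count, so the input side is one place extra spend could come from, though these figures alone don't settle that.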
Google undercutting/subsidizing its own prices to bite into Anthropic's market share (whilst selling at a loss) doesn't automatically mean Google is effective.
But Flash is 1/8 the cost of Sonnet and it's not impressive?
> Think about ANY other product and what you'd expect from the competition that's half the price.
Car, fashion, jewelry, earphone, furniture, keyboard, mouse, restaurant, house,...
Most things aren't worth commenting on except the Gemini posts here, which I find insane.
And for pretty much every example you gave, I'd expect quite a lot more for 2x the amount? Idk man
Gemini definitely has its merits, but for me it just doesn't do what other models can. I vibe-coded an app which recommends restaurants to me. The app uses the Gemini API to make restaurant recommendations given a bunch of data and a prompt.
App itself is vibe-coded with Opus. Gemini didn't cut it.
Opus is absurdly good in Claude Code, but there are a lot of use cases Gemini is great at.
I think Google is further behind with the harness than the model.
But I agree: If they can get there (at one point in the past year I felt they were the best choice for agentic coding), their pricing is very interesting. I am optimistic that it would not require them to go up to Opus pricing.
Skill issue, maybe, but I can't get Gemini to do any nontrivial tasks reliably, and it's difficult to have it do trivial tasks without getting distracted and making unrelated changes that eat my time and mental energy to think about.
The breakthrough advance of Opus 4.5 over 4.1 wasn't so much an intelligence jump, but a jump in discerning scope and intent behind user queries.
Is it? Honestly, I still chuckle about black Nazis and the female Indian Popes. That was my first impression of Gemini, and first impressions are hard to break. I used Gemini’s VL (vision) for something and it refused to describe the image because it assumed it was NSFW imagery, which it was not.
I also question stasis as an obvious follow-up. Is Gemini equal to Opus? Today? Tomorrow? Has Google led the industry thus far and do I expect them to continue?
Counterpoint to that would be that with natural language input and output, LLM-specific tooling is rare and it is easy to switch around if you commoditize the product backend.
EDIT: Gemini does have 1m context for "free" though so that's great.