
0 points · wolttam · 1mo ago · 0 comments

It depends on the use case. Yes, 90% of the cost is cache in agentic coding scenarios (actually 95% in my experience), but not when the model reasons for 200k+ tokens before answering a complex problem.
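To make that split concrete, here is a rough sketch. The prices and token counts are made up for illustration and are not taken from any real price list, and it assumes reasoning tokens are billed at the output rate. The helper name `cost_shares` is hypothetical too.

```python
def cost_shares(cached_in, fresh_in, out, price_cached, price_fresh, price_out):
    """Return each component's share of the total cost.

    Token counts are raw token numbers; prices are USD per million tokens.
    """
    costs = {
        "cached_input": cached_in * price_cached / 1e6,
        "fresh_input": fresh_in * price_fresh / 1e6,
        "output": out * price_out / 1e6,
    }
    total = sum(costs.values())
    return {name: cost / total for name, cost in costs.items()}


# Hypothetical prices (USD per million tokens), chosen only for illustration.
PRICES = dict(price_cached=0.30, price_fresh=3.00, price_out=15.00)

# Agentic coding: many turns re-reading a large cached context, short outputs.
agentic = cost_shares(cached_in=50_000_000, fresh_in=200_000, out=50_000, **PRICES)

# One hard problem: small prompt, 200k reasoning tokens billed as output.
reasoning = cost_shares(cached_in=0, fresh_in=5_000, out=200_000, **PRICES)

for label, shares in (("agentic", agentic), ("reasoning", reasoning)):
    print(label, {name: f"{share:.1%}" for name, share in shares.items()})
```

With these made-up numbers, cached input is about 92% of the agentic bill, while output is over 99% of the long-reasoning one.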


3 comments · 1 top-level

himata4113 · 1mo ago · 2 in thread

Gemini models solve a problem in 80% fewer tokens, so that's something to think about.

johaugum · 1mo ago

Source?

himata4113 · 1mo ago

https://help.kagi.com/kagi/ai/llm-benchmark.html
