0 points
Jgrubb
11d ago
The tokens are still being burnt, they're just being burnt in a parallel dimension, separate from the user's main context window.
4 comments · 2 top-level
ajmurmann
11d ago
· 2 in thread
It's true that the initial tool response still has the same number of tokens, but it doesn't keep getting dragged along in the longer-lived top-level context.
knollimar
11d ago
Don't you resend the context after every turn, so splitting it off avoids the n^2 token usage? (Granted, it's cached, so there's some optimal amount here.)
ajmurmann
10d ago
Yes, exactly. You resend it on every turn (assuming no cache hits). This is why using the shorter-lived subagent to take in that context and return only the useful result to the longer-lived context saves tokens.
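To put rough numbers on that, here is a back-of-the-envelope sketch. It is only an illustration under stated assumptions: the function names and token counts are made up, caching is ignored entirely, and only input tokens are counted (the tokens the subagent spends writing its summary are left out).

```python
def resend_total(tokens_per_turn: int, turns: int) -> int:
    """Input tokens sent over `turns` turns when each turn adds
    `tokens_per_turn` and the whole history is resent every time.
    Grows quadratically: tokens_per_turn * turns * (turns + 1) / 2."""
    return sum(tokens_per_turn * i for i in range(1, turns + 1))


def inline_cost(tool_tokens: int, turns_after: int) -> int:
    """Tool output stays in the main context and is resent on every
    main-context request made after it arrives."""
    return tool_tokens * turns_after


def subagent_cost(tool_tokens: int, summary_tokens: int, turns_after: int) -> int:
    """Subagent reads the full tool output once in its own context;
    the main context only carries (and resends) the short summary."""
    return tool_tokens + summary_tokens * turns_after


if __name__ == "__main__":
    # Hypothetical sizes: 10k-token tool output, 500-token summary,
    # 20 further turns in the long-lived context.
    print(inline_cost(10_000, 20))         # 200000
    print(subagent_cost(10_000, 500, 20))  # 20000
    print(resend_total(1_000, 10))         # 55000
```

With caching in play the gap shrinks, which is the "some optimal amount" point above, but the shape of the argument stays the same: the big payload is paid for once instead of on every later turn.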
ViewTrick1002
11d ago
The real benefit is being able to use a cheaper, but good enough, model with a specific system prompt dedicated to that task.