A lot of enterprises use Github Copilot which has per-request pricing model which effectively means unlimited tokens which eliminates this issue.
The harness receives a response, has to parse out the tool call, execute it and then start a new request with the tool call result.