See the link in my post. It asks you to run the tool. You run the tool and tell it the result... And then it uses the result of the tool to decide to reply to the user.
The link talks about tools that 'lie' - ie. a calculator which deliberately tries to trick GPT-4 into giving the wrong answer. It turns out that GPT-4 only trusts the tools to a certain extent - if the answer the tool gives is too unbelievable, then GPT-4 will either re-run the tool or give a hallucinated answer instead.