There are two ways to call out to an externally hosted LLM via an HTTP API:
1. A blocking call, which can take 3-10 seconds.
2. A streaming call, which can also take 3-10 seconds, but where the content is streamed back to you as it is generated (and which you may be proxying through to your user).
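As a sketch of what option 2 looks like with asyncio, here is a streaming call modeled as an async generator. The `fake_llm_stream` function is a stand-in for a real streaming HTTP response (the names and chunk delays are invented for illustration):

```python
import asyncio


async def fake_llm_stream(prompt: str):
    # Stand-in for a streaming LLM API: chunks arrive as the model generates them
    for word in ["Hello", " from", " the", " model"]:
        await asyncio.sleep(0.05)  # simulated per-chunk network latency
        yield word


async def consume(prompt: str) -> str:
    chunks = []
    async for chunk in fake_llm_stream(prompt):
        # In a web app you would forward each chunk to the user here,
        # instead of (or as well as) accumulating it
        chunks.append(chunk)
    return "".join(chunks)


result = asyncio.run(consume("hi"))
# result == "Hello from the model"
```

The key point is the `async for`: between chunks the event loop is free to serve other requests, rather than a thread sitting blocked on a socket read.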
In both cases you risk blocking a thread or process for several seconds. That's the problem asyncio solves: it lets you have hundreds (or even thousands) of users all waiting on the response to that LLM call without needing hundreds or thousands of blocked threads or processes.
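A minimal sketch of that claim, using `asyncio.sleep` in place of a real HTTP call to an LLM API (the function names here are invented for illustration):

```python
import asyncio
import time


async def fake_llm_call(prompt: str) -> str:
    # Stand-in for a slow blocking LLM API call that takes about a second
    await asyncio.sleep(1.0)
    return f"response to {prompt!r}"


async def main() -> list[str]:
    # 100 concurrent "users" all awaiting an LLM response on a single thread
    tasks = [fake_llm_call(f"prompt {i}") for i in range(100)]
    return await asyncio.gather(*tasks)


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
# All 100 calls complete in roughly 1 second of wall-clock time,
# not 100 seconds, and without 100 blocked threads
```

While each coroutine is awaiting its (simulated) response, the event loop is free to service the other 99, which is why the total time is close to that of a single call.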