AFAIK vLLM is built for concurrent serving, using batched inference to get higher throughput, not for single-user inference. I doubt its throughput is higher than Ollama's when it only handles a single prompt at a time.
Update: this is a good intro to continuous batching in LLM inference:
https://www.anyscale.com/blog/continuous-batching-llm-infere...
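
To make the batching point concrete, here is a toy sketch of my own (not vLLM's or Ollama's actual scheduler). It assumes each request needs a fixed number of decode steps and that one step advances every running request by one token. The function names and the request lengths are made up for illustration.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Decode steps when each batch must run until its longest request ends."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        # Short requests in the batch sit idle until the longest one finishes.
        steps += max(lengths[i:i + batch_size])
    return steps


def continuous_batching_steps(lengths, batch_size):
    """Decode steps when a freed slot is refilled from the queue right away."""
    queue = deque(lengths)   # remaining tokens per waiting request
    running = []             # remaining tokens per in-flight request
    steps = 0
    while queue or running:
        # Admit waiting requests into any free slots before the next step.
        while queue and len(running) < batch_size:
            running.append(queue.popleft())
        steps += 1
        # Each running request emits one token; finished ones leave the batch.
        running = [r - 1 for r in running if r > 1]
    return steps
```

With the hypothetical workload `[8, 2, 2, 2, 8, 2, 2, 2]` and `batch_size=4`, the static version needs 16 steps and the continuous version 10. With a single request `[8]` both need 8 steps, which is why I would not expect a batching-oriented server to win on one prompt at a time.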