The time between requests translates directly into response time, since someone is waiting during that time, aren’t they? If nobody is waiting and it’s not added to anyone’s wait time, then who cares?
The answer really is power consumption during GC even if nobody is waiting, but you didn’t mention that.
If you take only 10 ms to process a request and give the data back to the user, but then spend 200 ms afterwards cleaning up after yourself (garbage collection, background tasks, etc.), then each worker can serve fewer than 5 requests per second. If each request allocates 200 MB and frees it afterwards, and the server only has 1 GB of memory, you can have at most 5 workers, so in this case you top out at ~24 requests per second. Fixing either of those cases means higher throughput without having to scale out across more servers, since you keep future requests from waiting either by running more workers or by having each worker do less work between requests. This isn't even necessarily GC work; it could be sending off jobs to send emails, or any other jobs related to the request. All of this still ties up a worker that could be handling the next request.
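To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. The function name `max_throughput` and its parameters are made up for illustration, and it assumes 1 GB is 1000 MB, that every worker permanently reserves its full per-request allocation, and that nothing else on the box uses memory. Real servers are messier than this.

```python
def max_throughput(handle_ms, cleanup_ms, mem_per_worker_mb, total_mem_mb):
    """Estimate the throughput ceiling for a pool of blocking workers.

    Hypothetical helper: assumes a worker is busy for handle_ms + cleanup_ms
    per request and that memory is the only limit on the worker count.
    """
    # A worker is tied up for the request itself plus the cleanup afterwards.
    busy_ms = handle_ms + cleanup_ms
    per_worker_rps = 1000 / busy_ms

    # Memory caps how many workers fit on the server.
    workers = total_mem_mb // mem_per_worker_mb

    return workers, per_worker_rps, workers * per_worker_rps


# The numbers from the comment: 10 ms of work, 200 ms of cleanup,
# 200 MB per worker, 1 GB (taken as 1000 MB) of memory.
workers, per_worker, total = max_throughput(10, 200, 200, 1000)
print(f"{workers} workers x {per_worker:.2f} req/s = {total:.1f} req/s")

# Cutting the cleanup to 20 ms raises the per-worker rate.
print(max_throughput(10, 20, 200, 1000))

# Halving memory per worker doubles the number of workers instead.
print(max_throughput(10, 200, 100, 1000))
```

Either change raises the ceiling without adding servers, which is the whole point: the cleanup time and the memory footprint both count against throughput even though no individual user waits on them.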
Also, the number of requests you get has nothing to do with the number of requests you can actually process. You might be able to process 100k requests per second but only get 200 per second, or vice versa.