chrisseaton · 7y ago · 0 points

How can you increase throughput without reducing response time?

3 comments · 1 top-level

simcop2387 · 7y ago · 2 in thread

Throughput is going to be affected by anything that can bottleneck your application. Even if response time is the same, if you reduce the time spent cleaning up between responses and requests then you can handle a higher number of requests with the same number of workers. If you reduce the amount of memory being allocated you'd also be able to run more workers on the same hardware, also increasing throughput. And as you're implying, reducing time spent making responses would also allow you to increase throughput too.

chrisseaton (OP) · 7y ago

I still don’t understand the maths. If you get n requests per second either way and the time per request has not changed, then the throughput is the same, isn’t it?

Time between requests translates directly into response time, as someone is waiting during that time, aren’t they? If nobody is waiting and it’s not added to anyone’s wait time, then who cares?

The answer really is power consumption during GC even if nobody is waiting, but you didn’t mention that.

simcop2387 · 7y ago

Throughput involves both time to process a request and how many requests you can process per unit time.

If you only take 10 ms to process a request and give the data back to the user, but then take 200 ms afterwards cleaning up after yourself (garbage collection, background tasks, etc.), then each worker is busy for 210 ms per request and can serve fewer than 5 requests per second. If you allocate 200 MB per request and then free it afterwards, and only have 1 GB of memory on the server, you can have at most 5 workers, so in this case throughput tops out at just under 25 requests per second. Fixing either of those cases gives you higher throughput without having to scale out across more servers, since you keep future requests from waiting either by running more workers or by having each worker do less work between requests. This isn't even necessarily GC work; it could be sending off jobs to send emails, or any other jobs related to the request. All of this still ties up a worker that could be handling the next request.
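To make the arithmetic above concrete, here is a rough sketch in Python. The function name and parameters are made up for illustration, and it assumes 1 GB is 1000 MB and that every worker is identical and always busy. The point is that the user-visible response time stays at 10 ms in all three cases while the throughput ceiling changes.

```python
def max_throughput(service_ms, cleanup_ms, mem_per_worker_mb, total_mem_mb):
    """Upper bound on requests/second for a pool of identical workers.

    Hypothetical helper: models only the two limits discussed in the thread
    (per-worker cycle time and memory-limited worker count).
    """
    # The user waits only for service_ms, but the worker is tied up for the
    # whole cycle, including cleanup after the response has been sent.
    per_worker_rps = 1000.0 / (service_ms + cleanup_ms)
    # Memory caps how many workers fit on the box (assumes 1 GB = 1000 MB).
    workers = total_mem_mb // mem_per_worker_mb
    return workers * per_worker_rps


# Numbers from the comment: 10 ms response, 200 ms cleanup, 200 MB per worker, 1 GB RAM.
baseline = max_throughput(10, 200, 200, 1000)

# Same 10 ms response time, but cleanup cut to 50 ms (illustrative value).
less_cleanup = max_throughput(10, 50, 200, 1000)

# Same 10 ms response time, but half the memory per worker (illustrative value).
less_memory = max_throughput(10, 200, 100, 1000)

print(round(baseline, 1), round(less_cleanup, 1), round(less_memory, 1))
```

With these inputs the baseline comes out a little under 25 requests per second, and either change raises the ceiling without touching the 10 ms the user actually waits.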

Also, the number of requests you get has nothing to do with the number of requests you can actually process. You could be able to process 100k requests per second, but only get 200/second, or vice versa.
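One way to read that distinction, sketched in Python with hypothetical function names: what you actually serve is the smaller of what arrives and what you can handle, and anything arriving above capacity piles up as waiting requests.

```python
def served_rps(offered_rps, capacity_rps):
    """Requests per second actually completed (illustrative model only)."""
    return min(offered_rps, capacity_rps)


def backlog_growth_rps(offered_rps, capacity_rps):
    """Rate at which unserved requests accumulate when load exceeds capacity."""
    return max(0, offered_rps - capacity_rps)


# Plenty of headroom: capacity of 100k/s, only 200/s arriving.
under_loaded = served_rps(200, 100_000)

# The reverse: 100k/s arriving, capacity of only 200/s.
over_loaded = served_rps(100_000, 200)
queue_growth = backlog_growth_rps(100_000, 200)
```

In the first case raising capacity changes nothing the user can see; in the second it is the only thing that stops the queue, and therefore the wait time, from growing.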
