0 points · volta83 · 5y ago

Your network link supports a certain throughput and latency depending on the packet size. Your vendor should tell you what these are, and provide you with benchmarks to reproduce their claims (the OSU benchmarks reproduce these for, e.g., MPI).

The network card also has hardware limits on the bandwidth (BW) it can handle and on its latency. It is usually connected to the CPU via PCI-e, which also has its own latency and bandwidth.

All of this goes to the CPU, which has its own latencies and bandwidths for the different caches and DRAM.

So you should be able to model the theoretical maximum number of requests that the network can handle, and then the network interface, the PCI-e bus, etc., up to DRAM.

The amount that each of them can handle differs, so the bottleneck is going to be the slowest part of the chain.

As an extremely simplified example, say you have a 100 GB/s network, connected to a network adapter that can handle 200 GB/s, connected via PCI-e 3 to the CPU at 12 GB/s, which is connected to DRAM at 200 GB/s.

If each request has to receive or send 1 GB, then you can handle at most 12 req/s, because that is all your PCI-e bus can support.

If you are then delivering 1 req/s, then either your "model" is wrong, or your app is poorly implemented.

If you are then delivering 11 req/s, then either your "model" is wrong, or your app is well implemented.
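A minimal sketch of that back-of-the-envelope calculation, using only the numbers from the example. The function name `max_requests_per_second` and the stage labels are made up for illustration, and the model ignores latency, protocol overhead, and everything else a real model would need:

```python
# Peak bandwidth of each stage in GB/s, taken from the simplified example.
STAGES_GB_PER_S = {
    "network": 100.0,
    "network adapter": 200.0,
    "PCI-e 3": 12.0,
    "DRAM": 200.0,
}


def max_requests_per_second(stages, gb_per_request):
    """Return (name of the slowest stage, upper bound on req/s)."""
    # The slowest stage in the chain caps the whole pipeline.
    name = min(stages, key=stages.get)
    return name, stages[name] / gb_per_request


bottleneck, bound = max_requests_per_second(STAGES_GB_PER_S, gb_per_request=1.0)
print(f"bottleneck: {bottleneck}, upper bound: {bound:.0f} req/s")
```

Dividing a measured request rate by `bound` gives a rough idea of how much of the hardware limit the app is actually using.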

But if you are far away from your model, e.g., at 1 req/s, you can still validate your model, e.g., by using two PCI-e buses, which you then expect to be 2x as fast. Maybe your data about your PCI-e BW is incorrect, or you are not understanding something about how the packets get transferred, but the model guides you through the hardware bottlenecks.

The blog post lacks a "model", and focuses on "what the software does" without ever putting it into the context of "what the hardware can do".

That is enough to let you compare whether software A is faster than software B, but if you are the fastest, it doesn't tell you how much further you can go.
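One way to sketch that validation step, assuming (optimistically) that two PCI-e buses aggregate to exactly twice the bandwidth. The helper names, the 25% tolerance, and the measured rates passed in at the end are hypothetical, not data from the blog post:

```python
def upper_bound(stages_gb_per_s, gb_per_request):
    """Upper bound on req/s: the slowest stage divided by the request size."""
    return min(stages_gb_per_s.values()) / gb_per_request


base = {"network": 100.0, "network adapter": 200.0, "PCI-e": 12.0, "DRAM": 200.0}
# Assumption: a second PCI-e bus doubles the available PCI-e bandwidth.
doubled = dict(base, **{"PCI-e": 2 * base["PCI-e"]})

# The model predicts a 2x speedup, since PCI-e is still the slowest stage.
predicted_speedup = upper_bound(doubled, 1.0) / upper_bound(base, 1.0)


def model_consistent(measured_before, measured_after, predicted, tolerance=0.25):
    """True if the measured speedup is within a relative tolerance of the prediction."""
    measured = measured_after / measured_before
    return abs(measured - predicted) / predicted <= tolerance


# Hypothetical measurements in req/s, before and after adding the second bus.
print(predicted_speedup, model_consistent(1.0, 1.9, predicted_speedup))
```

If the measured speedup is nowhere near the prediction, the PCI-e stage was probably not the real bottleneck, and the model needs revisiting.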


7 comments · 2 top-level

slver · 5y ago · 3 in thread

Handling request response isn’t just about packet count. I might as well claim it’s all just electric current and short some wires for max throughput /s

gpderetta · 5y ago

The computation given by parent allows you to compute upper bounds and order of magnitude estimates. He is correct that you need these values to guide your optimizations.

volta83 (OP) · 5y ago

Sure, it's more complex than that, and an accurate model would be more complex as well.

But hey, doing science[0] is hard, better not be scientific instead /s

[0] Science as in the scientific method: model -> hypothesis -> test, improve model -> iterate. In contrast to the "shotgun" method or, as the blog author called it, the "whack-a-mole" method: try many things, be grateful if one sticks, no ragrets. /s

slver · 5y ago

Doing science is great, but first we need to make sure we're not comparing apples and oranges.

OP has defined the problem as speeding up an HTTP server (libreactor based) on Linux. So that's a context we assume as a base, questions like "what can the hardware do without libreactor and without Linux" are not posed here.

jiggawatts · 5y ago · 2 in thread

I had this literal debate with a "network engineer" that was trying to convince me that 14 Mbps coming out of a Windows box with dual 10 Gbps NICs was expected. You know... because "Windows is slow"!

I aim for 9 Gbps per NIC, but I still see people settling for 3 Gbps total as if that's "normal".

hansel_der · 5y ago

> but I still see people settling for 3 Gbps total as if that's "normal".

y'know - it might be enough

jiggawatts · 5y ago

It might also be 15% of what you purchased.
