The network card also has hardware limits on the BW it can handle and on its latency. It is usually connected to the CPU via PCI-e, which also has its own latency and bandwidth limits, etc.
All of this goes to the CPU, which has its own latencies and BW for the different caches and DRAM, etc.
So you should be able to model the theoretical maximum number of requests that the network can handle, and then the same for the network interface, the PCI-e bus, etc., all the way up to DRAM.
The amount that they can handle differs, so the bottleneck is going to be the slowest part of the chain.
As an extremely simplified example, say you have a 100 GB/s network, connected to a network adapter that can handle 200 GB/s, connected via PCI-e 3 to the CPU at 12 GB/s, which is connected to DRAM at 200 GB/s.
If each request has to receive or send 1 GB, then you can handle at most 12 req/s, because that's all your PCI-e bus can support.
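To make the arithmetic concrete, here is a minimal sketch of that "model" in Python. It only uses the numbers from the example; the names (`links_gb_per_s`, `bottleneck`, `max_requests_per_s`) are made up for illustration, and it ignores latency, protocol overhead and everything else a real model would need.

```python
# Bandwidth of each link in the chain, in GB/s (figures from the example).
links_gb_per_s = {
    "network": 100,
    "network adapter": 200,
    "PCI-e 3": 12,
    "DRAM": 200,
}
request_size_gb = 1  # each request moves 1 GB


def bottleneck(links):
    """Return (name, bandwidth) of the slowest link in the chain."""
    name = min(links, key=links.get)
    return name, links[name]


def max_requests_per_s(links, request_size):
    """Upper bound on throughput: slowest link bandwidth / data per request."""
    _, bandwidth = bottleneck(links)
    return bandwidth / request_size


name, bw = bottleneck(links_gb_per_s)
print(f"bottleneck: {name} at {bw} GB/s")
print(f"upper bound: {max_requests_per_s(links_gb_per_s, request_size_gb):.0f} req/s")
```

This prints the PCI-e link as the bottleneck and an upper bound of 12 req/s, which is the number to compare your measurements against.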
If you are then delivering 1 req/s, then either your "model" is wrong, or your app is poorly implemented.
If you are then delivering 11 req/s, then either your "model" is wrong, or your app is well implemented.
But if you are far away from your model, e.g., at 1 req/s, you can still validate your model, e.g., by using two PCI-e buses, which you would then expect to be 2x as fast. Maybe your data about your PCI-e BW is incorrect, or you are misunderstanding something about how the packets get transferred, but the model guides you through the hardware bottlenecks.
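A rough sketch of that validation step, again in Python and with made-up names (`model_req_per_s`, `efficiency`). It assumes, optimistically, that two PCI-e buses simply add up to 24 GB/s; whether real hardware behaves that way is exactly what the experiment would tell you.

```python
def model_req_per_s(links_gb_per_s, request_size_gb):
    """Model prediction: the slowest link bounds the request rate."""
    return min(links_gb_per_s.values()) / request_size_gb


def efficiency(measured_req_per_s, links_gb_per_s, request_size_gb):
    """Fraction of the modeled maximum that the app actually delivers."""
    return measured_req_per_s / model_req_per_s(links_gb_per_s, request_size_gb)


one_bus = {"network": 100, "network adapter": 200, "PCI-e": 12, "DRAM": 200}
# Assumption: a second bus doubles the aggregate PCI-e bandwidth.
two_buses = {**one_bus, "PCI-e": 24}

# How far is the app from the model at 1 req/s and at 11 req/s?
print(f"at  1 req/s: {efficiency(1, one_bus, 1):.0%} of the model")
print(f"at 11 req/s: {efficiency(11, one_bus, 1):.0%} of the model")

# What the model predicts if you add a second PCI-e bus.
predicted_speedup = model_req_per_s(two_buses, 1) / model_req_per_s(one_bus, 1)
print(f"predicted speedup with two buses: {predicted_speedup:.1f}x")
```

If the measured speedup with the second bus is nowhere near the predicted one, PCI-e was probably not the real bottleneck, and the model (or the data behind it) needs fixing.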
The blog post lacks a "model", and focuses on "what the software does" without ever putting it into the context of "what the hardware can do".
That is enough to compare whether software A is faster than software B, but if you are the fastest, it doesn't tell you how far you can go.