Some benchmarks are designed to measure absolute performance, in order to answer questions like "How many servers should we buy, if we expect to handle X hits/second?" or "What's the limit of our app stack's scalability, in hits/second?"
But the O.P. here is benchmarking toward a different purpose: comparing relative performance. These tests are designed to answer a totally different kind of question: "Which of these app stacks performs best, given the same hardware budget for each stack?" or "How many extra servers do we need to buy, if we want to use Apache/mod_php instead of Nginx/php-fpm?"
With relative-performance benchmarks, we have to assume that it's valid to extrapolate from small servers to large, and from one server to many. That is, if Nginx beats Apache by 1000% on a lonely 500MHz Pentium 3 box, what can we predict about Nginx vs. Apache performance on a dozen-strong cluster of quad-core, dual-socket 3.6GHz machines? In more general terms: How well does each application scale up (bigger machines) and scale out (more machines)?
The answer depends on the application type and software architecture. For example, most modern web servers are multi-thread/process apps, with minimal shared state in between. Also, modern web stacks generally push cross-request state into a separate datastore layer (if any). As a result, modern web apps tend to scale up linearly, to the performance limits of the datastore layer. Until your database becomes a bottleneck, you can expect that 2x web servers == 2x hits/second.
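To make the "2x web servers == 2x hits/second" claim concrete, here is a minimal sketch of that scaling model in Python. The function name and all the numbers are made up for illustration; the only idea taken from the comment above is that throughput grows linearly with the number of web servers until the shared datastore becomes the cap.

```python
def expected_throughput(web_servers, hits_per_server, datastore_limit):
    """Estimate hits/second for a stateless web tier in front of a shared datastore.

    Assumes perfectly linear scaling of the web tier, capped by what the
    datastore layer can sustain. Both assumptions are simplifications.
    """
    if web_servers < 0:
        raise ValueError("web_servers must be non-negative")
    return min(web_servers * hits_per_server, datastore_limit)


# Hypothetical numbers: 1000 hits/s per web server, datastore tops out at 5000 hits/s.
for n in (1, 2, 4, 8):
    print(n, expected_throughput(n, hits_per_server=1000, datastore_limit=5000))
```

Under this model a relative result measured on one small box carries over to a cluster only while you stay below the datastore cap; past that point adding web servers changes nothing.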
You need to have multiple (powerful) machines for this. And also, spinning up machines in the cloud is quite easy to do and allows people to reproduce the same test results because you have access to exactly the same environment.
Oh really? For a simple HTTP 'hello world' comparison benchmark (where you are interested in relative numbers, not absolute ones)?
All you need is one old, slow laptop (running the test contenders) and one modern, mighty machine (running the test script). The only thing you have to be sure about is that the test script can generate more load than the test contenders can handle. Even if the old laptop isn't slow enough, you can just add some predictable and stable load to the CPU, disks, network, or whatever the bottleneck is. You can use tools that exist for that, or even quick & dirty hacks like a one-liner 'while(1) {do some math}' that, when run with high system priority, effectively turns your 2-core CPU into a 1-core CPU.
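For what it's worth, the 'while(1) {do some math}' hack could look something like this in Python. This is only a sketch: `burn_cpu` is a name I made up, and the time limit is added so the example terminates on its own. It keeps one core busy for roughly the requested duration.

```python
import math
import time


def burn_cpu(seconds):
    """Spin on floating-point math for roughly `seconds`; return the iteration count."""
    deadline = time.perf_counter() + seconds
    iterations = 0
    total = 0.0
    while time.perf_counter() < deadline:
        # The math itself is irrelevant; it only has to keep the core busy.
        total += math.sqrt(iterations + 1.0)
        iterations += 1
    return iterations


# Short demo run; for an actual benchmark you would pass a much longer duration.
print(burn_cpu(0.1) > 0)
```

A single Python process like this occupies at most one core, so you would start one per core you want to take away from the contenders. Raising its scheduling priority is left to whatever the OS provides.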
I haven't worked on these in a couple years, but on real hardware, haproxy could push much more bandwidth. We could saturate 10Gb ethernet fairly easily at the time, which wasn't possible at all with nginx.
What the option does is close the connection between the proxy and the backend so that HAProxy will analyse further requests instead of just forwarding to the already established connection.
To be fair, I don't know what nginx does - whether connections are kept open or shut down - so I'm not sure that it'd be a fair comparison.
Also interesting are the HAProxy built in SSL times. I'm surprised they're so slow. Perhaps the cipher is also the culprit. The cipher can also be specified in HAProxy.
bind *:8080 ssl crt /root/balancerbattle/ssl/combined.pem ciphers RC4-SHA:AES128-SHA:AES:!ADH:!aNULL:!DH:!EDH:!eNULL
Is this in the nginx config? Can anybody elaborate a bit further? Here is what I am currently using in my nginx config for ssl:
ssl_session_cache shared:SSL_CACHE:8m;
ssl_session_timeout 5m;
# Mitigate BEAST attacks
ssl_ciphers RC4:HIGH:!aNULL:!MD5;
ssl_prefer_server_ciphers on;
openssl s_client -host localhost -port 8082
Which is an openssl command. These settings were used for testing SSL: https://github.com/observing/balancerbattle/blob/master/ngin...
See https://gist.github.com/3rd-Eden/5345018 for the output of the openssl s_client for those ciphers. You'll see that `cipher : RC4-SHA` is used here, which is one of the fastest ciphers available, if not the fastest.
% ping -c 10 -A 4.2.2.1
PING 4.2.2.1 (4.2.2.1) 56(84) bytes of data.
64 bytes from 4.2.2.1: icmp_req=1 ttl=56 time=3.24 ms
64 bytes from 4.2.2.1: icmp_req=2 ttl=56 time=2.88 ms
64 bytes from 4.2.2.1: icmp_req=3 ttl=56 time=2.95 ms
64 bytes from 4.2.2.1: icmp_req=4 ttl=56 time=2.90 ms
64 bytes from 4.2.2.1: icmp_req=5 ttl=56 time=2.95 ms
64 bytes from 4.2.2.1: icmp_req=6 ttl=56 time=2.91 ms
64 bytes from 4.2.2.1: icmp_req=7 ttl=56 time=2.90 ms
64 bytes from 4.2.2.1: icmp_req=8 ttl=56 time=2.87 ms
64 bytes from 4.2.2.1: icmp_req=9 ttl=56 time=2.94 ms
64 bytes from 4.2.2.1: icmp_req=10 ttl=56 time=2.94 ms
--- 4.2.2.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 1806ms
rtt min/avg/max/mdev = 2.875/2.952/3.247/0.112 ms, ipg/ewma 200.705/3.019 ms
I can't imagine why a local websocket echo service should be 5x slower than this.
You can enable TLS 1.2 ciphers, but there is a large percentage of clients unable to use them, so the fallback is RC4.
> At the moment, the attack is not yet practical because it requires access to millions and possibly billions of copies of the same data encrypted using different keys
So we have some time.
Something tells me that perhaps Tod Sul is doing something wrong.
From an operations standpoint, haproxy has other features (failover, cli management, clustering) that actually make it a much better load balancer. I usually install all three (haproxy, stud, nginx) because they are each very good in their specific niche. As for the simplicity of installation, that can be handled with a configuration manager.
HAProxy also has a raw tcp mode, which is great when you have to balance non-http services.
source: https://github.com/observing/balancerbattle#benchmarking