Do you perform load testing on your services? If so, how, and what are your experiences / two cents on the issue?
At the time, because of the company I worked at, and because the service itself was a C#/.NET service, it made sense to write tests in C#/.NET using Microsoft's Visual Studio Test Framework. Visual Studio, starting (I think) with the 2013 version, has a built-in load-testing capability that integrates with Azure (you have to create an account) and will temporarily spin up instances for you to load-test the endpoint(s) you specify, using the parameters you provide in web tests created in Visual Studio. It worked rather well for us. I understand this is highly specific to our use case, but if you didn't know this existed, it's something to be aware of.
We did it with Gatling (a well-known tool in the Java world), running on some of our local machines and firing requests at our machines on Amazon EC2.
As was mentioned previously, what to test depends on several questions. To name a few:

* Do you want to test a complete web application or just a few services?
* Does the speed of your system depend only on the number of instances, or are there other factors to look at, like database connections?
* Do you want to test vertical scaling, or do you want to know about horizontal scaling too?
We ran the tests mainly to find the "breaking point" of our database, which was our weakest link.
Anyway, stress tests are quite useful: they give you an idea of how far your system can scale, and they help you find where to improve the code before spending too much money on servers.
At first those 4 machines were hitting the database so hard that it became the bottleneck. What we did to solve that was to reduce the number of database accesses made by the front machines by using an in-memory cache. It can increase latency while the cache is being loaded, but it works fine 95% of the time.
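The caching trick described above can be sketched as a small TTL cache sitting in front of the database lookup. This is a generic illustration, not our actual code; the names and the loader function are hypothetical:

```python
import time

class TTLCache:
    """Minimal in-memory cache with per-entry expiry."""

    def __init__(self, ttl_seconds=60.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # cache hit: no database access
        value = loader(key)      # cache miss: one database access, then cache it
        self._store[key] = (now + self.ttl, value)
        return value

if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=60)
    # `load_user_from_db` stands in for the real database query.
    load_user_from_db = lambda key: {"id": key}
    cache.get_or_load("user:1", load_user_from_db)  # miss: hits the database
    cache.get_or_load("user:1", load_user_from_db)  # hit: served from memory
```

The trade-off mentioned in the post shows up directly here: the first request for a key pays the full database latency plus the cache write, and every request within the TTL window after that skips the database entirely.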
Another option for us would have been a larger database machine, but we were using Amazon RDS ... believe me, you want to keep that machine as small as possible or spend a lot of money :-)
We used two machines to simulate up to 600 requests per second, which was much more than our average load, actually much, much more. And we measured it without HAProxy, Varnish, or similar systems between our servers and the clients.
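For context on what a rate like that means for the load generators, the average number of in-flight requests can be estimated with Little's law (concurrency ≈ arrival rate × mean latency). The 50 ms latency below is an assumed example figure, not a measurement from this system:

```python
def required_concurrency(requests_per_second, mean_latency_seconds):
    """Little's law: average number of requests in flight = rate * mean latency."""
    return requests_per_second * mean_latency_seconds

if __name__ == "__main__":
    # 600 req/s at an assumed 50 ms mean latency -> about 30 concurrent requests,
    # which is why two generator machines can be plenty at that rate.
    print(required_concurrency(600, 0.050))
```

The same estimate also tells you how many worker threads or connections each generator machine needs to sustain the target rate.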