Also don't forget that you are load testing all the dependencies of your service. Database, caching tier, external services, etc. Make sure other teams are aware!
Also, nothing beats real-world traffic. Users' connections will stay open longer than a synthetic tool would hold them (due to bandwidth), and users make very random, sporadic requests too. Your service will behave very differently under large amounts of real-world traffic vs synthetic.
Another option, if you're running multiple web servers, is to shift traffic around to increase load on one host and see where it fails. That's usually a very reliable signal for peak load.
And don't forget to do this on a schedule as your codebase (and your dependencies' codebases) change!
Guilty. I've had one of our partners call me one time because I'd caused a huge load on their end. Lots of apologies and embarrassment followed.
Later I mocked out those external calls with stubs that behaved similarly, in that I could specify the min/max/average wait times and error rates.
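A minimal sketch of that kind of stub in plain Ruby (the class name, parameters, and return value are invented for illustration; the real stubs would mimic your actual partner API):

```ruby
# Hypothetical stub for an external dependency: it sleeps a random time in
# [min_ms, max_ms] and fails on a configurable fraction of calls.
class StubbedPartnerClient
  def initialize(min_ms:, max_ms:, error_rate:)
    @min_ms = min_ms
    @max_ms = max_ms
    @error_rate = error_rate
  end

  def call
    sleep(rand(@min_ms..@max_ms) / 1000.0)      # simulate network latency
    raise "simulated upstream error" if rand < @error_rate
    { status: 200 }                             # stand-in for a real response
  end
end

# Roughly matches a partner API with 20-200ms latency and a 1% error rate:
client = StubbedPartnerClient.new(min_ms: 20, max_ms: 200, error_rate: 0.01)
```

Swapping this in for the real client keeps the load profile realistic without hammering anyone else's servers.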
wrk does this with Lua: https://github.com/wg/wrk/blob/master/src/wrk.lua
Also, even things like the venerable JMeter support pulling parameters from a CSV file.
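The idea behind that CSV parameterization fits in a few lines of plain Ruby using the stdlib CSV library (the CSV content and the /login endpoint are made up for the example):

```ruby
require "csv"

# Sketch of CSV-driven parameterization, the idea behind JMeter's CSV data
# sets: each virtual user pulls the next row and uses it in its request.
rows = CSV.parse("username,password\nalice,s3cret\nbob,hunter2", headers: true)

rows.each do |row|
  # A real tool would issue an HTTP request per row, e.g.:
  #   http.post("/login", username: row["username"], password: row["password"])
  puts "POST /login as #{row['username']}"
end
```

Varying the parameters per request matters: replaying one identical request mostly exercises your caches, not your service.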
Once you've got that out of the way, don't forget that you'll want a distribution story. It does not matter how efficient your tool might be on a single machine - you'll want to distribute your tests across multiple clients for real-world testing.
"Sure, it's easy," you might say. "I know UNIX. Give me pssh and a few VMs on EC2." Well, now you've got two problems: aggregating metrics from multiple hosts and merging them accurately (especially those pesky percentiles... your tool IS reporting percentiles rather than averages already, right?!), and a developer-experience problem: no one wants to wrangle infra just to run a load test, so how are you going to make that easier?
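To make the percentile-merging pitfall concrete, here's a toy Ruby sketch (nearest-rank percentile, made-up latency numbers): averaging each host's p99 hides a tail that the merged samples still show. The correct approach is to merge raw samples, or mergeable histograms such as HDR histograms, and compute percentiles over the union.

```ruby
# Nearest-rank percentile over a sample set.
def percentile(samples, p)
  sorted = samples.sort
  sorted[((p / 100.0) * (sorted.size - 1)).round]
end

host_a = [10, 12, 11, 13, 500]   # one slow outlier on this host
host_b = [10, 11, 12, 13, 14]

# Wrong: average the per-host p99s.
avg_of_p99 = (percentile(host_a, 99) + percentile(host_b, 99)) / 2.0

# Right: merge the samples, then take the percentile.
true_p99 = percentile(host_a + host_b, 99)

puts avg_of_p99   # 257.0 -- the tail is diluted
puts true_p99     # 500   -- the tail is still there
```

Percentiles simply aren't averageable, which is why distributed load tools have to ship raw samples or histograms back to a coordinator.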
And this developer-experience problem is much bigger than just sorting out infra... you'll probably want to send the metrics produced by your tool to external observability systems. So now you've got some plugins to write (along with a plugin API). The list goes on.
I'm very biased, but it's 2024. Don't waste your time and just use https://www.artillery.io/
1. https://www.artillery.io/blog/load-testing-workload-models
Shameless self-promotion, but I wrote up a bunch of these issues in a post describing all the mistakes I've made, so you can learn from them: https://shane.ai/posts/load-testing-tips/
If your desired load for testing is small, it's not a big deal, of course.
Ruby with browser:
https://browserup.com/docs/en/load/ruby-load-test.html
We're currently trying to have each Rails model implement a #new_example method that builds a valid subgraph filled in by Faker, ready to save. I.e.,
user = User.new_example
will come with a Company.new_example if every user needs a company relationship. Still early; we'll see how it goes.
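A plain-Ruby sketch of that #new_example pattern (Faker and ActiveRecord are left out so the snippet stands alone; in the real app the hardcoded values would be Faker calls and these would be Rails models):

```ruby
class Company
  attr_reader :name

  def initialize(name:)
    @name = name
  end

  def self.new_example
    new(name: "Example Co")   # Faker::Company.name in the real thing
  end
end

class User
  attr_reader :email, :company

  def initialize(email:, company:)
    @email = email
    @company = company
  end

  # Builds the whole valid subgraph: a User example always comes with a
  # Company example for its required association.
  def self.new_example
    new(email: "user@example.com",   # Faker::Internet.email in the real thing
        company: Company.new_example)
  end
end

user = User.new_example   # ready to save, required associations included
```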
We generate data based off of your database schema and your production data (if you give us access).
Since you've kinda already built something like this I would be curious to hear what you think!
Talking about efficiency being a priority, but using RoR. I guess that is one way of saturating the CPU.
Most bottlenecks are either database choices or poor code/design choices by developers. That's especially true today.
A coworker made similar claims to me about Laravel, but the framework really encourages you to do half a dozen database queries in even a pretty minimal request, and, for example, implemented bulk inserts as a loop that did single inserts. With an access pattern like that, if you didn't know better you might think the database is the bottleneck long before it actually should be. Is Rails different? My sense is they're very similar.
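To illustrate the access-pattern difference (framework-agnostic, not Laravel or Rails specifically), here's a toy Ruby sketch where a hypothetical FakeDB just counts statements; in Rails the single-statement version would be something like insert_all:

```ruby
# FakeDB is invented for illustration: it only counts statements issued,
# because the point is round-trips, not SQL.
class FakeDB
  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  def insert(row)        # one statement (and one round-trip) per row
    @query_count += 1
  end

  def insert_all(rows)   # one multi-row statement for the whole batch
    @query_count += 1
  end
end

rows = Array.new(100) { |i| { id: i } }

loop_db = FakeDB.new
rows.each { |r| loop_db.insert(r) }   # "bulk insert" as a loop: 100 round-trips
bulk_db = FakeDB.new
bulk_db.insert_all(rows)              # true bulk insert: 1 round-trip
```

Under load, the per-row version makes the database look slow when the real cost is 100x the round-trips.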
That said, in my experience, CPU is often the ultimate bottleneck with PHP, Ruby, Python, and... well, everything. Over the years serializers have often been a pain point: XML in PHP and RoR, and the Rails "serializers" currently. Any sort of mapping or hydration (which is a LOT of what happens in web apps) is comparatively slow, often an order of magnitude or more, compared to something like Node.js, C#, Go, etc.
> Most bottlenecks are either that database choices or poor code/design choices by developers
Perhaps in sheer quantity, but with experience those are often low-hanging fruit. After those are addressed, you're left with the pain of the language's and framework's inefficiencies.
Rails has really poor startup time due to loading all codepaths. We switched to Django, and it runs beautifully on AWS Lambda, where our CI is more expensive than actual server costs. We're a B2B application, so traffic is quite low, and we REALLY don't saturate the CPU in a normal Fargate setup.
Our ~500k lines app takes multiple seconds to start, which is why I'm not really investigating a lambda-style setup... Do you have specific strategies to make startup fast?
There are definitely times it isn't true. But if you're not doing a load test because it's a pain, do it locally. Most of the time I've wanted to do this, all the action is inside the app. Just be careful to acknowledge that there could be limitations/surprises.