For readers who want all of these and more in book form, with a sample Rails app and a large generated data set to test with, please consider my book:
High Performance PostgreSQL for Rails https://news.ycombinator.com/item?id=38407585
The book helps readers build database skills, with the overall goal of improved performance and scalability.
Again, great, concise article. I’ll be recommending it to others and it will help a lot of developers!
Thanks!
CDNs are another way to track everybody. So privacy is an excellent reason for not using a CDN.
Pretty sure this only applies to HTTP/1 and you'll get better performance with HTTP/2:
"Connection-specific header fields such as Connection and Keep-Alive are prohibited in HTTP/2 and HTTP/3"
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Ke...
"HTTP persistent connection, also called HTTP keep-alive, or HTTP connection reuse, is the idea of using a single TCP connection to send and receive multiple HTTP requests/responses, as opposed to opening a new connection for every single request/response pair. The newer HTTP/2 protocol uses the same idea and takes it further to allow multiple concurrent requests/responses to be multiplexed over a single connection."
Ever read those articles explaining that half the internet is unavailable because of some random outage at, e.g., Cloudflare? That.
CREATE INDEX index_accounts_balance ON accounts (id) WHERE boolean_flag;
The purpose of the index is often to lower the query’s “cost” by including more of the values the query accesses (for filtering, ordering, or selecting) within the index entries, for faster retrieval.
When users can generate query plans, review whether an index supports a query, and verify that the planner picks the index and that it actually lowers the cost, they can answer this question for their own unique combination of hardware, data distribution, queries, and indexes.
As generic advice, I think more often than not the index won’t be used for Boolean columns. But it’s generic advice and it does depend.
As you suggested, users must check their own system.
My book also covers the two-value and three-value (when nulls are allowed) variations of Boolean columns.
For readers wanting to build these skills themselves, here’s more info:
https://news.ycombinator.com/item?id=38407585
Good nudge that it depends!
Also, a proficient developer looks at the SQL logs anyway as they develop a given feature.
Rails' flavor of ORM is particularly composable and transparent: you can easily mix and match it with vanilla SQL.
This is a place where I think tools like Rubocop help. They can be configured to point out method swaps like this (size over count) automatically, which makes changing the code a relatively low-effort task.
With those rules/linting in place, you aren’t throwing out the benefits of AR (ORM), and you’re hopefully leveraging useful methods like these that help avoid unnecessary queries.
More concretely, performance is on my list of "good problems to have". Businesses die in the time it takes to write raw SQL.
I assume this is telling us it doesn't actually make an ActiveRecord instance out of each row when you do that. And instantiating big bunches of ActiveRecord model instances just to grab a few fields from a result set with a lot of rows can be sooo slow.
the CDN faux pas is forgiven, nobody is perfect
I can't even read the site with that text contrast. Why is illegibility a trend at all?