Garbage Collection in Ruby 2.1 (opens in new tab)

(tmm1.net)

163 pointstmm112y ago30 comments

30 comments

21 comments · 7 top-level

stcredzero12y ago· 5 in thread

So Ruby objects occupy 40 bytes each, inside pages on the eden heap.

That's pretty huge. That's several times larger than some other dynamic languages -- languages that have essentially the same features as Ruby.

phamilton12y ago

Not really. Python primitives are smaller (24 bytes). Python strings are 37 + 1 byte per character. Python objects are 72 bytes. Python classes are 104 bytes. (Got these numbers playing around with sys.getsizeof in the python console.)

Primitives are a bit bigger in ruby, but they aren't really primitives anyway (everything is an object).

stcredzero12y ago

Python is another offender!

1 more reply

rurban12y ago

Perl objects are even bigger and need one indirection per access. A perl head is 32 bytes plus 48 byte body, ie. 80byte on 64bit systems.

For my perl rewrite p2 I need one single word per object on the stack. Common Lisps usually needs two words, the class pointer and the tagged value.

I really liked the visualization of the various GC's btw. Excellent work!

riffraff12y ago

I have no idea of the size in other language implementations, but notice 40 bytes is on x86-64, on 32 bit it's either 20 or 24.

joevandyk12y ago

this is why running large ruby apps on 64 bit OSs can take up a lot more memory than 32 bit.

1 more reply

AlexanderDhoore12y ago· 4 in thread

It's interesting to compare this to how python does it (reference counting with generations): http://patshaughnessy.net/2013/10/30/generational-gc-in-pyth...

"At first glance, Ruby and Python seem to implement garbage collection very differently. Ruby uses John McCarthy’s original mark and sweep algorithm, while Python uses reference counting. But when we look more closely, we see that Python uses bits of the mark and sweep idea to handle cyclic references, and that both Ruby and Python use generational garbage collection in similar ways. Python uses three separate generations, while Ruby 2.1 uses two.

This similarity should not be a surprise. Both languages are using computer science research that was done decades ago – before either Ruby or Python were even invented. I find it fascinating that when you look “under the hood” at different programming languages, you often find the same fundamental ideas and algorithms are used by all of them. Modern programming languages owe a great deal to the ground breaking computer science research that John McCarthy and his contemporaries did back in the 1960s and 1970s."

oleganza12y ago

I think Henry Baker had a paper on why generational and refcounting GCs are ultimately equivalent. Couldn't find it quickly, though, to give you a link.

hencq12y ago

I think you might be thinking of this paper: A unified theory of garbage collection [1], which shows how most modern collectors are hybrids of GC and ref counting.

[1] http://atlas.cs.virginia.edu/~weimer/2008-415/reading/bacon-...

2 more replies

iagooar12y ago

Isn't it actually kind of sad, that such important researches don't get considered until 40 or 50 years later?

riffraff12y ago

I don't think such research wasn't considered, it's been used constantly in many projects since.

The reason why some platforms have "silly" choices like stop the world M&S GC, or interpreters with AST walking is, likely, that optimization wasn't a goal in the original implementation and it's hard to retrofit it while keeping compatibility.

venus12y ago· 3 in thread

What's this RailsApp.preload_all method mentioned in this post? Is that a github thing?

Great post btw. I am looking forward to getting some production apps onto 2.1 in the coming weeks and seeing how performance characteristics have changed in the new release.

tmm1OP12y ago

We implement preload_all, which loops over app/{models,controllers}/ and requires everything. This method is generally called from config.ru, and happens in the unicorn master before it forks off workers.

I recently upstreamed a warmup method for Rack::Builder you can also use for this purpose: https://github.com/rack/rack/pull/617

venus12y ago

Thanks for that link (and for submitting it!). I've done something similar in deploy scripts, to warm up a fresh instance before bringing it into rotation - even better having it able to be baked into rack like that.

purephase12y ago

That's interesting. Thanks for sharing. I'll have to check this out.

jeffdavis12y ago· 2 in thread

To me, it's interesting that after all this time and with the prevalence of languages that rely heavily on garbage collection, garbage collectors still don't seem to scale to large heaps.

Large heaps mean long pauses when doing the "mark" phase of the generation holding most of the objects. That seems like a big problem that will increase as memory sizes increase.

Maybe the way I'm looking at it is too simplistic and there are better methods now. But even in Java, which has had plenty of time to work these issues out, I have seen issues with long GC pauses.

My impression right now is that GCs just don't scale to large heaps. To use more memory, you need to either manage memory yourself (which not a lot of modern languages allow), or increase the number of independent heaps (by using more tasks/processes).

Please enlighten me if I'm wrong here.

eonwe12y ago

That seems to be the case in Oracle Java at least. With 80+ GB heaps, you can experience multi-minute stop-the-world collections when a full GC strikes. Luckily, that can be tuned to be very rare (for the usual load).

For better latency without such hiccups, the only solution I know of is Azul's Zing (http://www.azulsystems.com/zing/pgc) which really did away with larger pauses at least with our software.

jeffdavis12y ago

What is "the usual load"?

Interesting; I'll check out the Azul one. I'm a little skeptical, but it might be an improvement. Did you see better throughput or just reduced latency?

1 more reply

RyanZAG12y ago

Also interesting to compare this with how the G1 collector for the JVM handles it: http://www.infoq.com/articles/G1-One-Garbage-Collector-To-Ru...

EDIT: Better link: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.63....

rurban12y ago

The important number is the avg pause time. Here with ruby 2.1 7ms for a minor and 58ms for a major sweep for a typical ~500K heap app.

In my potion-based GC (a stack-scanning, compacting, cheney two-finger GC) the avg GC needs 3ms, but it's not concurrent (multi-threaded) yet.

All of these can be considered real-time, i.e. < 15ms. For bigger heaps the scans need to be incremental (saving and restoring GC state, which is easy with libgc or cheney).

The best java GC I found needed 150ms pause time. Good lisp's have real-time GCs.

ksec12y ago

And all these are all very similar to Google's Chrome V8 and Mozilla's SpiderMonkey. Although both had incremental done already, for Ruby that is scheduled for v2.2.

Now just when will Ruby get a Method or Tracing JIT?

j / k navigate · click thread line to collapse

30 comments

21 comments · 7 top-level

stcredzero12y ago· 5 in thread

So Ruby objects occupy 40 bytes each, inside pages on the eden heap.

That's pretty huge. That's several times larger than some other dynamic languages -- languages that have essentially the same features as Ruby.

phamilton12y ago

Primitives are a bit bigger in ruby, but they aren't really primitives anyway (everything is an object).

stcredzero12y ago

Python is another offender!

1 more reply

rurban12y ago

Perl objects are even bigger and need one indirection per access. A perl head is 32 bytes plus 48 byte body, ie. 80byte on 64bit systems.

For my perl rewrite p2 I need one single word per object on the stack. Common Lisps usually needs two words, the class pointer and the tagged value.

I really liked the visualization of the various GC's btw. Excellent work!

riffraff12y ago

I have no idea of the size in other language implementations, but notice 40 bytes is on x86-64, on 32 bit it's either 20 or 24.

joevandyk12y ago

this is why running large ruby apps on 64 bit OSs can take up a lot more memory than 32 bit.

1 more reply

AlexanderDhoore12y ago· 4 in thread

It's interesting to compare this to how python does it (reference counting with generations): http://patshaughnessy.net/2013/10/30/generational-gc-in-pyth...

oleganza12y ago

I think Henry Baker had a paper on why generational and refcounting GCs are ultimately equivalent. Couldn't find it quickly, though, to give you a link.

hencq12y ago

I think you might be thinking of this paper: A unified theory of garbage collection [1], which shows how most modern collectors are hybrids of GC and ref counting.

[1] http://atlas.cs.virginia.edu/~weimer/2008-415/reading/bacon-...

2 more replies

iagooar12y ago

Isn't it actually kind of sad, that such important researches don't get considered until 40 or 50 years later?

riffraff12y ago

I don't think such research wasn't considered, it's been used constantly in many projects since.

venus12y ago· 3 in thread

What's this RailsApp.preload_all method mentioned in this post? Is that a github thing?

Great post btw. I am looking forward to getting some production apps onto 2.1 in the coming weeks and seeing how performance characteristics have changed in the new release.

tmm1OP12y ago

I recently upstreamed a warmup method for Rack::Builder you can also use for this purpose: https://github.com/rack/rack/pull/617

venus12y ago

purephase12y ago

That's interesting. Thanks for sharing. I'll have to check this out.

jeffdavis12y ago· 2 in thread

To me, it's interesting that after all this time and with the prevalence of languages that rely heavily on garbage collection, garbage collectors still don't seem to scale to large heaps.

Large heaps mean long pauses when doing the "mark" phase of the generation holding most of the objects. That seems like a big problem that will increase as memory sizes increase.

Maybe the way I'm looking at it is too simplistic and there are better methods now. But even in Java, which has had plenty of time to work these issues out, I have seen issues with long GC pauses.

Please enlighten me if I'm wrong here.

eonwe12y ago

For better latency without such hiccups, the only solution I know of is Azul's Zing (http://www.azulsystems.com/zing/pgc) which really did away with larger pauses at least with our software.

jeffdavis12y ago

What is "the usual load"?

Interesting; I'll check out the Azul one. I'm a little skeptical, but it might be an improvement. Did you see better throughput or just reduced latency?

1 more reply

RyanZAG12y ago

Also interesting to compare this with how the G1 collector for the JVM handles it: http://www.infoq.com/articles/G1-One-Garbage-Collector-To-Ru...

EDIT: Better link: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.63....

rurban12y ago

The important number is the avg pause time. Here with ruby 2.1 7ms for a minor and 58ms for a major sweep for a typical ~500K heap app.

In my potion-based GC (a stack-scanning, compacting, cheney two-finger GC) the avg GC needs 3ms, but it's not concurrent (multi-threaded) yet.

All of these can be considered real-time, i.e. < 15ms. For bigger heaps the scans need to be incremental (saving and restoring GC state, which is easy with libgc or cheney).

The best java GC I found needed 150ms pause time. Good lisp's have real-time GCs.

ksec12y ago

And all these are all very similar to Google's Chrome V8 and Mozilla's SpiderMonkey. Although both had incremental done already, for Ruby that is scheduled for v2.2.

Now just when will Ruby get a Method or Tracing JIT?

j / k navigate · click thread line to collapse