This proposed alternative of just toggling the GC on and off in a sleeping loop feels like a pretty big sledgehammer - and just as much of a hack. Going by the original Twitch blog's 10 GCs/second figure, each 500ms sleep spans about 5 GC cycles, which would also concern me as a potential unwanted latency spike. I'm also curious what happens when the GC is toggled back off mid-GC. It's more code, and it feels brittle. That ReadMemStats sync point may be worse than the GC spam in the first place!
That's the trade-off of Go's knob-less GC, I suppose.
This is unlike the Go GC, which is tuned for latency at the expense of throughput, with no way to modify its behavior without resorting to hacks like the one in the linked article.
The method presented in the article does seem better in that it uses well-known and documented parts of Go's runtime API, but I think it might be problematic for other reasons. Fiddling with GC behavior is always a little risky: it works fine until you hit some weird corner case and it blows up.
For example: what happens if that goroutine doesn't run for longer than you expect, leaving the GC turned off while another goroutine is creating a ton of garbage? It might never be a problem, but that depends on allocation behavior and how much headroom you have.
So it feels more correct, but it also seems to require a lot more tuning and testing before you could feel confident about it.
Sort of. A change in the undocumented behavior might cause you to lose your fine-tuning at some point in the future, but I wouldn't say it'll ever cause it to break. You're just telling Go how much memory you want to pre-allocate. It'll continue doing that; if that stops getting you the same GC benefits you wanted, then at worst you'll be back in the same boat you were originally.
Writing your own GC routine, on the other hand, gives you a ton of new opportunities for introducing very real breakage via your own code.
In the issue thread, Caleb Spare also proposed a minimum heap size, so that you get GOGC-ish behavior once your app uses enough RAM but don't have constant GCs with a tiny heap.
There's definitely a common issue where the GOGC heuristic doesn't take advantage of situations where it can collect less often but still remain in the "don't care" range of memory use. (CloudFlare talked about the same thing making benchmark results weird: https://blog.cloudflare.com/go-dont-collect-my-garbage/ )
And there can definitely be situations where GC'ing a bit more would be worth it to keep a process under an important memory threshold to avoid swapping or OOM kills.
The designers famously don't want too many knobs, but some other ways to convey user priorities to the runtime could certainly save users from awkward workarounds and fiddling with the existing knobs.
What works well is calculating your own capacity needs, then setting the autoscaler to that new capacity number. In other words, using your knowledge of how your system works, you'll make better decisions than by looking only at secondary metrics like resource utilization.
I know I've done manually triggered GC in Ruby and Java but I don't know enough about Go to say if the article's suggestion is reasonable.
There are enough rockets on the rocket-powered horse that is GC to make it to the moon and back.
I'll save this for the next time someone posts something along the lines of "you can't program X in a GC'd language because the GC is so unpredictable".
"Quite a hacky solution" describes every single detail of every scrap of code connected in any way to GC. It is the whole point of the enterprise. If hacky solutions make you unhappy, your only route to happiness is to run very far away.
A lot gets done with very hacky solutions, and you will never need to throw a rock very far to hit somebody who swears by them. Those of us who don't swear by them haven't time to get that work done, so for most of the world's work, it's hacks or nothing.
Heroic efforts from the Rust community aside, linear types are far from ready for general consumption in any kind of software development.
Plus, having a GC does not preclude being able to stack-allocate, keep data in a manually managed memory segment, or even resort to manual memory management in unsafe code blocks.
Examples of GC-enabled languages with such features include Modula-3, Mesa/Cedar, Active Oberon, Nim, D, Eiffel, C#, F#, System C# (M#), Sing#, Swift, ParaSail, and Chapel.
Eventually Java might gain such capabilities, if Panama and Valhalla actually end up in the official implementation.
Manual memory management is required for some critical code paths, but so is Assembly; both are niches, not something to spend 100% of our coding hours on.
I can't read the referenced Twitch article from work, so I can't comment on it. I'm also not sure of the practical loads and implementation details, and I'm surprised that the Go GC was an issue to begin with.
I know I've purposely called the GC in languages that use one, for ETL jobs running on shared servers, to minimize memory usage beforehand.
It is fundamentally misleading to call C++ or Rust "lower-level" languages than Go or Java. (As it is, also, to say "C/C++".) Both Rust and C++ support much more powerful abstractions than either, making them markedly higher-level. That they also enable actually coding abstractions to manage resources (incidentally including memory resources) reduces neither their expressiveness nor the productivity of skilled programmers.
The point of Java and Go is that less-skilled programmers can use them to solve simpler problems more cheaply. Since most problems are simple, those languages have a secure place.