A surprising JavaScript memory leak found at Meteor

(point.davidglasser.net)

179 points · glasser · 13y ago · 104 comments

74 comments · 21 top-level

RyanZAG13y ago· 10 in thread

That this kind of issue exists in javascript should come as a surprise to no one - there are probably a huge number of these kind of issues just waiting to be found. It's actually the key difference between a 'designed' language and an 'evolved' language.

EDIT: I find the down-votes very amusing. Has javascript now developed a cult? Read up on the history of javascript if my message above comes as a shock to you.

crazygringo13y ago

I think the down-votes are because you're attacking JavaScript without basis, and not constructively. This "memory leak" has nothing to do with the distinction between languages you make. And indeed, this kind of behavior in closures is by design (as I understand it), and not by evolution at all.

RyanZAG13y ago

Fair point, but I disagree with you - fully specced-out, designed languages generally specify exactly how the implementation should handle these cases: how references should be retained, when references fall out of scope, etc.

eg:

http://clang.llvm.org/docs/Block-ABI-Apple.html

http://download.oracle.com/otndocs/jcp/lambda-0_5_1-edr2-spe...

http://www.microsoft.com/en-us/download/details.aspx?id=7029

etc etc. I also don't believe I'm attacking javascript here, merely repeating what people have always known about evolved languages. Evolved languages have a lot of pros (mainly speed of new features and practical application), and they have cons too. What I'm saying is very simple: this type of bug in javascript should not be a surprise - it should be expected. There will be more.

EDIT: Laughably enough, while the ObjC and Java specifications both specifically address this issue, the C# specification actually formalizes this exact bug into the language specification. So I concede your point, C# - a designed language - simply designed the exact same bug into the language itself. See section 7.15.5.1 Captured outer variables.

Feel free to downvote this post - I now concede that designed languages are just as likely to design stupid bugs into the language as evolved languages are to create them by mistake. ;)

viraptor13y ago

It's a bug in a complex element (GC). They happen. I wouldn't attribute it to a specific language, since even this post only mentions the latest Chrome.

There's a small reproducer example, so there will be a new version issued shortly. Not really something to get excited about in my opinion.

(Yes, I know this behaviour could be actually called ok and allowed, but you can do analysis on that code that will show the variable is not used. Even if not now, it can be fixed at some point in the future.)

((Downvoted as language bashing, rather than because I like JS - I hate it with passion for other reasons. "there are probably a huge number of these kind of issues just waiting to be found" is just a silly claim unless you're actually working on finding them or a true claim for any kind of general purpose software.))

RyanZAG13y ago

I believe you are correct; my misconception about the disparities between designed and evolved languages was nicely cleared up by reading the specification for the designed C# language. The down-vote for my stupid comment is appreciated: it got me to actually read the thing.

nknighthb13y ago

I hate JavaScript myself, but you're blaming it for what is actually an implementation problem, not a property of the language itself. It's also a problem I could see occurring in the implementation of any other language that supports closures.

pcwalton13y ago

This is a V8 problem, not a JavaScript problem. Any implementation of any garbage-collected language with higher-order functions could have this issue.

wtetzner13y ago

Actually, in this case it's not the design of JavaScript that's the problem. It's V8's implementation of closures. The two functions are sharing the same environment object, but there's nothing in JavaScript the language that dictates this implementation.

lquist13y ago

I wonder if this will temper the rise of the server-side JS frameworks?

RyanZAG13y ago

Not likely - these kind of bugs are generally fixed when found, or remain hidden simply because that part of the language is very rarely used. You'll likely see this bug in Chrome fixed within a few months.

I was commenting more on the practical effects of having an evolved language - this is a known and likely outcome in an evolved language. Designed languages generally spend a lot of time trying to avoid these kind of issues - it makes sense that not spending that time and adding in features as they are thought up means issues like this will creep in somewhere.

PommeDeTerre13y ago

This problem is quite minor and obscure compared to many of the other inherent flaws (type coercion and the broken comparison operators, the lack of proper namespacing, broken scoping, semicolon insertion, hoisting, prototype-based OO, and so on) we see all throughout JavaScript.

If those problems weren't enough to convince certain developers that JavaScript is a bad idea, and unsuitable for use, then it's likely that nothing will. This is especially true for server-side usage of JavaScript, where there are so many far better alternatives.

mayank13y ago· 9 in thread

TL;DR -- all closures created within a function are retained even if only one closure is ever referenced. This is really interesting, and potentially devastating for certain coding styles.

I ran the code on node and the RSS does appear to be increasing without bound. Even running node with --expose-gc and forcing a global.gc() inside logIt() causes the same unbounded memory growth.

Increasing the size of the array by a factor of 10 causes RSS usage to jump up by a factor of 10 every second, so we know that the memory usage isn't caused by creating a new long-lived closure (i.e., the logIt() function) every second.

In fact, removing the call to doSomethingWithStr() doesn't change the unbounded memory growth.

Here's a shorter snippet that demonstrates the leak more dramatically (about 10 MB/sec RSS growth):

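  var run = function () {
    var str = new Array(10000000).join('*');
    // Never called, but it mentions str. Per the diagnosis in the post,
    // that is enough to keep str in the environment shared by both closures.
    var unused = function () {
      str;
    };
    // The only closure kept alive (by setInterval); it never touches str.
    var fn = function () {};
    setInterval(fn, 100);
  };
  setInterval(run, 1000);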

Tried it out on node v0.8.18
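
The measurement described above can be sketched roughly as follows, assuming a Node process. `process.memoryUsage()` is a standard Node call and `gc` is only defined when node is started with `--expose-gc`; the helper names `sampleRssMb`, `forceGcIfExposed`, and `measureGrowth` are made up for illustration.

```javascript
// Resident set size of the current Node process, in megabytes.
function sampleRssMb() {
  return process.memoryUsage().rss / (1024 * 1024);
}

// Runs a collection only when node was started with --expose-gc.
// Returns true if a collection was requested, false otherwise.
function forceGcIfExposed() {
  if (typeof globalThis.gc === 'function') {
    globalThis.gc();
    return true;
  }
  return false;
}

// Calls `work` `rounds` times and records the RSS after each call,
// so growth that survives a forced collection becomes visible.
function measureGrowth(work, rounds) {
  const samples = [];
  for (let i = 0; i < rounds; i += 1) {
    work();
    forceGcIfExposed();
    samples.push(sampleRssMb());
  }
  return samples;
}
```

Passing a function that does one iteration of the suspected leak as `work` and comparing the first and last samples gives a crude signal; RSS is noisy, so only a steady upward trend over many rounds means much.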

glasserOP13y ago

To be clear: when you say "removing the call to doSomethingWithStr() doesn't change the unbounded memory growth" you do literally mean "removing the call" and not "removing the function definition", right? That matches what I see.

mayank13y ago

Correct. I'm actually a little surprised that an identity statement like "str;" isn't optimized out.

Edit: removing the function definition eliminates the problem, so I'd say that your diagnosis of the problem is spot on.

richardofyork13y ago

>> TL;DR -- all closures created within a function are retained even if only one closure is ever referenced. This is really interesting, and potentially devastating for certain coding styles. <<

This is not devastating; it is how closures and the scope chain work in JavaScript. See my detailed explanation below. The crux of the matter is that the closure still references the outer function's scope chain, even after the outer function has returned. And the outer function's scope chain is referenced by the closure until the closure is destroyed or returns.

reeses13y ago

I'm a very lazy (oops) language implementer. Every time but one when I've implemented closures, I took the quick approach of just copying the entire frame to the heap.

The other time, I was able to do more data flow analysis, but it also resulted in a bunch of annoying fiddly bugs and took more maintenance.

I'm not suggesting the v8 team took the 'easy way' out, but doing the deep introspection is hard, and in an environment such as a browser, I can see trading a pathological case such as this for what must be 1,000,000,000 inappropriate aggressive gc bugs.

(See the recent discussion over RubyMotion for examples of the opposite problem: http://news.ycombinator.com/item?id=5949072)

bzbarsky13y ago

Note that the TL;DR may well be V8-specific; other engines optimize closures differently and may not have the same action-at-a-distance interaction between closures.

But the real killer is that you can't tell which closures will keep what alive unless you assume that every closure keeps everything it closes over (even if it doesn't use it) alive. Nothing in the ES spec requires them not to...

sampk13y ago

What is "RSS"?

tantalor13y ago

After a nontrivial degree of Googling,

http://en.wikipedia.org/wiki/Resident_set_size

sampk13y ago

It was an honest question. Thanks for the downvote moron, very helpful.

casual_slacker13y ago

I'm very confused by this whole thread.

Is it that, only one closure is created for variables defined in the function run, and variables needed in any inner-function are added to run's closed context? That seems like a bug fixable without impacting the language.

crazygringo13y ago· 8 in thread

Honestly, I'm not really sure I'd call this a "memory leak" -- I just always assumed that all the variables present in a closure are maintained, if there's an existing reference to any function defined within. I'm not surprised at all, this is what I would expect.

I'm actually surprised/impressed that Chrome has an optimization to detect which closure variables are actually used in a function and garbage-collect the rest.

But it's certainly something important to be aware of. I think the main point is, why on earth would you be constantly re-defining a function "logIt" within a function that is repeatedly called? That's just bad programming -- whenever you define functions within functions in JavaScript, you really need to know exactly what you're doing.

tolmasky13y ago

> why on earth would you be constantly re-defining a function "logIt" within a function that is repeatedly called? That's just bad programming -- whenever you define functions within functions in JavaScript, you really need to know exactly what you're doing.

It's just a reduction to demonstrate the problem. I'm sure the code that actually triggered it was much more complex and had valid reasons for doing whatever it was doing.

crazygringo13y ago

I'm sure it was more complex, of course.

But what I mean is -- whenever you define a function within another function, that should never be a "default" way of writing JavaScript, because it creates closures, and you're just asking for "memory leaks".

Because closures use memory, you need to make sure there's a valid reason for creating the closure, and above all that if you're creating multiple closures by calling the parent function repeatedly, that there's a really really good reason for it, and that the closures are able to be garbage-collected later on.

Dylan1680713y ago

> I just always assumed that all the variables present in a closure are maintained, if there's an existing reference to any function defined within.

In the logical sense each function has its own closure. There is no way functions should affect which[edit for clarity] variables are closed on by each other. So you shouldn't expect this.

> I'm actually surprised/impressed that Chrome has an optimization to detect which closure variables are actually used in a function and garbage-collect the rest.

In the logical sense functions close over their variables, not all variables that happen to be in scope. So that optimization is needed.

crazygringo13y ago

If I'm understanding what you're saying, that's just not correct. Try and run this code:

    var a = function() {
      var z = "a";
      window.b = function() { z = "b"; }
      window.c = function() { return z; }
    };
    a();
    window.c(); // returns "a"
    window.b();
    window.c(); // returns "b"

As you can clearly see, the single closure containing the variable "z" is shared between functions b() and c().

So each function does not have its own closure -- closures work "upwards", referencing everything in every scope above them.

jcampbell113y ago

> In the logical sense each function has its own closure. There is no way functions should affect the variables closed on by each other. So you shouldn't expect this.

What? The whole point of a closure is that it is shared.

ulisesrmzroche13y ago

In JS, functions can refer to variables defined in outer scopes, and can refer to variables defined in outer functions even after those functions have returned. JS functions are first-class objects.

They also store any variables they may refer to that are defined in their enclosing scopes, including the parameters and variables of outer functions.

bzbarsky13y ago

It's a common enough optimization for JS engines in some form.

In particular, if the closure only reads the variable and nothing can write it, it's way cheaper in terms of performance to just create a separate lexical environment for the closure that contains the things it reads so that you don't have to keep walking up the scope chain on bareword lookups in the closure.

The fact that things then end up not referenced and can be gced is just a side-effect of the primary goal of the optimization: faster access to variables in a closure.

lesslaw13y ago

Works as per spec, won't fix. Closing.

The definition of "memory leak" is smeared all across the road.

ChuckMcM13y ago· 5 in thread

An excellent catch. The parse tree for the closure should be able to ascertain the reachability of the variables in scope so that fences around particular sets of variables can be established. To do that would require something a bit more sophisticated than a single taint id though. You would really want a taint 'flavor' such that for any closure and any variable in scope of that closure you could define f(closure, var), which would return true if that variable cannot be collected. I can't think off the top of my head how you would recycle identifiers without risking temporal tainting (where a new closure in scope with the same id came at a later time and double tainted the variable, leaving it essentially uncollectable or collected early). Hmm, or maybe not: since their addresses on the heap will be different, you're probably OK with that.

FWIW that is certainly the twisty bits of garbage collected languages.

pcwalton13y ago

> The parse tree for the closure should be able to ascertain the reachability of the variables in scope so that fences around particular sets of variables can be established.

Sadly it's not quite that simple because of the behavior of `eval` (strictly speaking, direct `eval`): it can leak bindings into the surrounding scope.

jcampbell113y ago

The eval case is already being handled: the entire local scope is automatically captured if there is an eval.

It seems like what ChuckMcM is describing is reference counting which variables are needed in the closure. His suggestion would fix this memory issue, but would probably carry a big performance penalty.

glasserOP13y ago

Right, but according to sections 10.4.2 and 15.1.2.1.1 of the ECMAScript standard, you only get lexical bindings inside your `eval` if it's literally a call to a thing called `eval`. (Which I believe is what you meant by "direct `eval`".)

So you should be able to statically determine if this is the case, and it's not here.

j_s13y ago

Nice! You've explained why this 'memory leak' could be by design - blame eval.

In fact, Chrome GC-ing unreferenced variables could break eval if they didn't cover that case.

ChuckMcM13y ago

ewww, yuck! Absolutely right about that. So you need to plug the tainting into the code generation perhaps? That suggests a reachability test (can the code ever be reached) combined with a taint test (code that can be reached and has this variable in scope). Then there is code that could exist if instantiated but doesn't yet.

33a13y ago· 5 in thread

That is surprising? A setInterval without a matching clearInterval is always a memory leak.

glasserOP13y ago

The surprise isn't that the memory associated with `logIt` is leaked.

The surprise is that `logIt` holds on to the giant `str` object in its environment, despite the fact that there is literally no way for the `str` variable to ever be used again.

V8 is smart enough to not make `logIt` hold on to `str` if there are no closures at all which refer to `str`. It's the introduction of the unrelated `doSomethingWithStr` closure that forces `str` into the lexical environment.

glasserOP13y ago

BTW, the original bug in the Meteor codebase wasn't a setInterval.

It was that code to replace a certain type of object (a Spark renderer) with a new instance of that object accidentally ended up with a reference to the preceding renderer in a closure assigned somewhere on it, even though that particular closure didn't actually use that reference.

So instead of replacing renderer #N with renderer #(N+1) and GCing #N, we ended up with a stream of renderers which never could be GCed... even though the reference keeping the old ones alive was literally impossible to ever use.

byroot13y ago

No it's not.

  setInterval(function() { console.log('foo'); }, 1000);

Is not a memory leak.

But I agree that it's not surprising at all. If you can access a variable from somewhere it will obviously not be garbage collected.

mayank13y ago

The point is that you can't access str from elsewhere, which is why it is surprising.

33a13y ago

That allocates a closure. Now if you want it to sit around forever, then hey it is working as intended. Otherwise, it does "leak" some memory that will never get reclaimed by the GC.
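
For the case where the timer is not meant to live forever, pairing `setInterval` with `clearInterval` might look like the sketch below. `startTicker` and the `stop` function it returns are hypothetical names; only `setInterval` and `clearInterval` are standard.

```javascript
// Starts a repeating timer and returns a function that cancels it.
// Once cancelled, the timer no longer references `callback`, so the
// callback and whatever it closes over can be collected.
function startTicker(callback, intervalMs) {
  const timer = setInterval(callback, intervalMs);
  let stopped = false;
  return function stop() {
    if (!stopped) {
      clearInterval(timer);
      stopped = true;
    }
    return stopped;
  };
}
```

Calling `stop()` more than once is harmless in this sketch; the flag only avoids clearing the same timer repeatedly.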

glasserOP13y ago· 3 in thread

I don't know how to look at memory usage in other modern JS interpreters like those used in Firefox and Safari. Is anyone able to check whether they have this problem? (And whether they manage to avoid the memory leaks in the first two code samples?)

shardling13y ago

In Firefox you could always do it manually with about:memory -- 1MB a second should be easy enough to notice. Not sure they've added a way to get a pretty graph yet, though. Probably Firebug has a way?

I wonder if using the "use strict" directive would let the browser optimize this more easily? Probably not, but Javascript does have a lot of crazy corner cases, and one of them might be preventing (or just making it harder to prove safe) the optimization.

e: Here's an example of one of those corner cases:

    console.log = eval

Now whether those functions reference a particular variable depends on the string they're printing! And this swap out of log() could be done at any time.

e2: Ah, as glasser and pcwalton point out, an indirect reference to eval doesn't work the same way as a direct call. TIL!

glasserOP13y ago

No, actually, you literally have to call eval as a function called eval if you want to get the local lexicals. See sections 10.4.2 and 15.1.2.1.1 of the ECMAScript standard, or try it out:

   > (function () { var x = 5; eval("console.log(x)"); })()
   5
   > (function () { var x = 5; var e = eval; e("console.log(x)"); })()
   ReferenceError: x is not defined

masklinn13y ago

I haven't checked the first two samples but:

* Safari 5.1 exhibits no significant heap growth, as far as the timeline shows anyway

* Using top to observe private memory, Firefox does seem to experience significant heap growth on the provided program

richardofyork13y ago· 2 in thread

First, it is not a bug in JavaScript at all.

Here is a technical explanation of why there is a memory leak and how to fix the problem.

The scope chain of Closures (in JavaScript) contains the outer function(s) activation object. The activation object of a function contains all the variables that the function has access to, and it is part of a function’s scope chain.

This means the inner function (the closure) has access (a reference) to the outer function’s scope chain, including the global object. And even after the outer function has returned, the closure still has access to the outer function’s variables.

Therefore, the activation object of the outer function cannot be destroyed (for garbage collection) after it has returned, because the closure still references its variables.

When the outer function returns, its own scope chain for execution (its execution object) is destroyed, but its activation object is still referenced by the closure, so its variables will not be destroyed until the closure is destroyed.

The execution context of a function is associated with the functions’ activation object, but while the execution object is used for its own execution and is destroyed when it returns, its activation object is referenced by closures—its inner functions.  

Now, as to the specific example in question: the str variable is never destroyed because it is referenced by the logIt function. The logIt function's execution object references the entire scope chain of the run function, and the logIt function is never destroyed, so the str variable remains in memory.

As the original author (OP) suggested, be sure to dereference any local variable in the outer function that the closure is using, once the closure is done with it or once the outer function is done with it.

Also, simply setting the logIt function to null (when it completes execution, i.e. returns) will allow the str variable and the entire scope chain of both the logIt and the containing run function to be destroyed and ready for garbage collection.

For a detailed explanation of closures in JavaScript, see "Understand JavaScript Closures With Ease": http://javascriptissexy.com/understand-javascript-closures-w...
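
A minimal sketch of the "dereference the local" workaround mentioned above. Only the names `run`, `str`, `logIt`, and `doSomethingWithStr` come from the discussion; the function bodies are assumptions, and `run` returns `logIt` instead of scheduling it so that the example terminates.

```javascript
function run() {
  // A large string standing in for the data that should not be retained.
  let str = new Array(1000000).join('*');

  // A closure that mentions str. Its existence is what places str in
  // the environment shared with logIt.
  const doSomethingWithStr = function () {
    if (str === 'something') {
      console.log('str was something');
    }
  };
  doSomethingWithStr();

  // The long-lived closure. It never touches str.
  const logIt = function () {
    return 'interval';
  };

  // Clear the binding once run is done with it. The variable may still
  // exist in the shared environment, but it no longer points at the
  // large string, so the string itself becomes collectable.
  str = null;

  return logIt;
}
```

The binding is declared with `let` because it is reassigned; with `var` the sketch behaves the same way.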

nknighthb13y ago

You seem to be describing behavior that the article did not. Specifically, the retention of 'str' because of the 'logIt' function.

If you read more closely, you'll notice that logIt was not the cause of 'str's retention, but instead, the doSomethingWithStr function was.

When logIt is present by itself, str is not retained. Only when doSomethingWithStr is added alongside logIt is str retained.

richardofyork13y ago

nknighthb, I am just wondering. Did you downvote me because you think one part of my explanation was not specific enough?

First, you are correct that I specifically mentioned the logIt function when I discussed the specific example. But that does not take away from my thorough explanation of the main reason for the problem. In fact, everything I said about the logIt function applies to the doSomethingWithStr function, since they are both closures, so my explanation stands as is.

If you read my explanation again, you will see that I clearly explained that closures still have access to the outer function's scope, so both the logIt and the doSomethingWithStr functions have access to the outer function's scope chain even after the outer function or any of the other closures returns.

It is not until all closures are destroyed or return that the outer function's activation object is destroyed.

wingspan13y ago· 2 in thread

The C# compiler does the same thing:

http://blogs.msdn.com/b/ericlippert/archive/2007/06/06/fyi-c...

This is an old article, but I believe it still applies to the latest iteration of the language.

RyanZAG13y ago

Does indeed still apply, and is actually formalized into the language spec itself [1] in section 7.15.5.1 Captured outer variables. So it's actually a feature (really) in the case of C#, not a bug.

[1] http://www.microsoft.com/en-us/download/details.aspx?id=7029

"the local variable x is captured by the anonymous function, and the lifetime of x is extended at least until the delegate returned from F becomes eligible for garbage collection (which doesn’t happen until the very end of the program)."

br113y ago

I don't see how that part of the standard mandates that all anonymous functions share the same closure. The example only has a single lambda.

Tloewald13y ago· 2 in thread

It's more of a surprising non-leak caused by clever GC that is insufficiently clever to handle pathological code. Interesting explanation, but yet another reason to be very careful of setInterval.

ufo13y ago

Not only setInterval though. This sort of leak can also show up if the closure is a DOM event handler or is otherwise exposed.

Tloewald12y ago

It can show up anywhere, and in fact event handlers are a major source of leaks in general, but the point is that the leak doesn't occur as frequently as you might expect, not that it occurs.
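
A small sketch of the event-handler case, assuming a runtime where `EventTarget` and `Event` are globals (browsers and recent Node versions). `attachHandler` and the shape of the object it returns are hypothetical. The handler closes over `payload`, so `payload` stays reachable for as long as the target keeps the listener registered.

```javascript
// Registers a 'tick' listener that uses `payload`, and returns helpers
// for inspecting and removing it.
function attachHandler(target, payload) {
  let calls = 0;
  let lastLength = 0;

  const handler = () => {
    calls += 1;
    lastLength = payload.length;
  };

  target.addEventListener('tick', handler);

  return {
    callCount: () => calls,
    lastLength: () => lastLength,
    // Removing the listener drops the target's reference to the handler,
    // and with it the path from the target to payload.
    detach: () => target.removeEventListener('tick', handler),
  };
}
```

In use, one would create a target with `new EventTarget()`, attach, dispatch `new Event('tick')` as needed, and call `detach()` when the handler is no longer wanted.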

bsimpson13y ago· 2 in thread

I don't know how you can safely declare what's used vs unused in a language where properties can be dynamically accessed.

It's easy to check for dict.someProperty, but dict[propertyName] where propertyName can be the string "someProperty" is a much more complicated problem. Now you have to build a tree of everything that changes propertyName and make sure you know it can never be "someProperty".

masklinn13y ago

You can't dynamically access scope objects in standard javascript[0] except by using `eval`, and `eval` already triggers special handling and deoptimizations in pretty much all browsers.

[0] outside of the global scope through `window` but all variables are clearly local here so that doesn't apply.

bsimpson13y ago

Doh! Good point.

btilly13y ago· 2 in thread

If we implement the suggested fix, and then someone evals code in that scope, they can get the surprising result that variables which look like they should be in scope actually aren't. Careful analysis of this will have many edge cases to worry about.

Implementing the simplest thing that is correct according to the spec sounds like a good idea to me.

glasserOP13y ago

Yes, but you can statically determine whether or not eval is called.

A world-class JavaScript environment like V8 is far past "the simplest thing that is correct according to the spec".

In fact, the JavaScript spec doesn't actually define anything about garbage collection! You can create a compliant ECMA-262 runtime that literally never collects any garbage ever. That's certainly the simplest correct thing. But it's a bad idea!

V8 already goes to the trouble of figuring out whether or not a given variable needs to be stored in the lexical environment or not. Specifically: variables that are not used in ANY closures and where there is no eval in sight can be stored outside of the lexical environment. This is great, and better than many similar programming language environments offer.

It just would be even better if they went one step farther.
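
The rule described above can be sketched as three cases. This reflects how the thread describes V8 at the time, not a guarantee about any engine or version; `makeBig` and the three function names are hypothetical.

```javascript
// Builds a large string to stand in for expensive data.
const makeBig = () => new Array(1000000).join('*');

// Case 1: no closure mentions str and there is no eval. Per the
// description above, str need not be stored in the lexical environment,
// so the returned closure does not keep it alive.
function noClosureUsesStr() {
  const str = makeBig();
  if (str.length === 0) {
    throw new Error('unexpected empty string');
  }
  return function logIt() {
    return 'tick';
  };
}

// Case 2: a sibling closure mentions str. Per the thread, str then lives
// in the environment shared by all closures created in this call, so
// logIt retains it even though logIt never uses it.
function siblingUsesStr() {
  const str = makeBig();
  const unused = function () {
    return str.length;
  };
  return function logIt() {
    return 'tick';
  };
}

// Case 3: a direct call to eval can name any variable in scope, so the
// whole scope has to stay available to the closure.
function containsDirectEval(code) {
  const str = makeBig();
  return function logIt() {
    return eval(code);
  };
}
```

All three return a working closure; the difference is only in what each closure may keep reachable, which has to be observed with a heap profiler rather than from inside the program.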

btilly13y ago

V8 attempts to be fast. There is a trade-off between sophisticated analysis during compilation, and page performance.

I'm not as convinced as you that they haven't found the right balance.

mAritz13y ago· 1 in thread

The fix in https://github.com/meteor/meteor/commit/49e9813 seems to be just limiting the effect of the problem, not removing it.

The variable is still being held in memory just with the value null instead of whatever it was before.

Or does setting it to null trigger some special GC algorithm that detects that it isn't being used anymore?

mistercow13y ago

There's nothing particularly special about null - it's really just that it's being set equal to any constant (or object that already exists for the life of the program). The object that was allocated and previously held in renderer will be GC'd (nothing special about that - it doesn't have any references anymore), and setting to null doesn't create a new allocation. You still have "renderer" as an entry in the dictionary of variables for the closure (which takes up a small amount of memory, presumably), but it's no longer preventing an object from being deallocated.

gsg13y ago· 1 in thread

This problem has been encountered and studied before: see Appel and Zhao's paper, Efficient and Safe-for-Space Closure Conversion.

glasserOP12y ago

Cool! A major reason I made this post was because I assumed that this must be something people had seen before, but I had trouble finding any references to it in the JavaScript context. I'll add a link to http://flint.cs.yale.edu/flint/publications/escc.pdf to the blog post!

dreamdu5t13y ago· 1 in thread

I'm surprised this comes as a surprise to Meteor developers. References retained won't be garbage collected... regardless of whether the lexical scope is "really small" or not, lol.

wtbob13y ago

The surprise is that a non-referenced value isn't collected, not that a referred-to value isn't.

glasserOP13y ago

OP here. Several people have pointed out that of course I should expect a leak, since each `logIt` object is leaked. That's absolutely true; my point was just that we don't want `str` to leak.

But the original bug that led to this discovery involved a data structure that shouldn't have leaked at all. I've updated the post to show it; duplicated here since GitHub Pages seems to cache posts pretty aggressively.

    var theThing = null;
    var replaceThing = function () {
      var originalThing = theThing;
      // Define a closure that references originalThing but doesn't ever actually
      // get called. But because this closure exists, originalThing will be in the
      // lexical environment for all closures defined in replaceThing, instead of
      // being optimized out of it. If you remove this function, there is no leak.
      var unused = function () {
        if (originalThing)
          console.log("hi");
      };
      theThing = {
        longStr: new Array(1000000).join('*'),
        // While originalThing is theoretically accessible by this function, it
        // obviously doesn't use it. But because originalThing is part of the
        // lexical environment, someMethod will hold a reference to originalThing,
        // and so even though we are replacing theThing with something that has no
        // effective way to reference the old value of theThing, the old value
        // will never get cleaned up!
        someMethod: function () {}
      };
      // If you add `originalThing = null` here, there is no leak.
    };
    setInterval(replaceThing, 1000);

btipling13y ago

Nice find. I've run into some inexplicable garbage collection issues where canvas elements in an early version of flot ended up never getting released. I could figure out no way to clear these, but I remember the flot code made heavy use of closures. I should go revisit the code and see if this might explain it. I ended up simply hanging on to the elements and reusing them, had to patch flot to achieve this.

michaelwww13y ago

This article taught me how to find a major leak in my Dart2js code so thanks John McCutchan & Loreena Lee http://www.html5rocks.com/en/tutorials/memory/effectivemanag...

glasserOP13y ago

Looks like Go optimizes this fully, as pointed out by https://twitter.com/nynexrepublic/status/350717895971586049

If you run these on your own machine and peek at the RSIZE, it stays constant (well, the first grows slowly due to `logIt`).

http://play.golang.org/p/A5Pz-3kthP http://play.golang.org/p/RnXr_jB5Qh

finnw13y ago

>You could imagine a more clever implementation of lexical environments that avoids this problem. Each closure could have a dictionary containing only the variables which it actually reads and writes; the values in that dictionary would themselves be mutable cells that could be shared among the lexical environments of multiple closures.

That is basically how Lua implements closures.

https://bugzilla.mozilla.org/show_bug.cgi?id=542074
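
A rough sketch of the per-closure dictionary idea quoted above, written by hand rather than by a compiler. `makeCell`, `outer`, and the `env`/`invoke` layout are illustrative assumptions, not how Lua or any JavaScript engine actually represents closures.

```javascript
// A cell is a small mutable box holding one variable's current value.
function makeCell(value) {
  return { value };
}

// Hand-converted version of a function with two locals (big, counter)
// and three inner functions. Each inner function gets an env holding
// only the cells it actually reads or writes.
function outer() {
  const big = makeCell(new Array(1000).join('*'));
  const counter = makeCell(0);

  const usesBig = {
    env: { big },
    invoke() {
      return this.env.big.value.length;
    },
  };

  const bump = {
    env: { counter },
    invoke() {
      this.env.counter.value += 1;
      return this.env.counter.value;
    },
  };

  const read = {
    env: { counter },
    invoke() {
      return this.env.counter.value;
    },
  };

  return { usesBig, bump, read };
}
```

Because `bump` and `read` share the same `counter` cell, a write through one is visible through the other, yet neither holds any path to `big`. If `usesBig` is dropped, nothing else keeps the large string reachable.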

doctorpangloss13y ago

Kudos to the excellently talented Meteor team. A great product from a genuine customer (of a free product), who feels great that there is such attention to detail.

oakaz13y ago

The example code looks really silly. As others said, it's just bad programming.

RyanZAG13y ago· 10 in thread

EDIT: I find the down-votes very amusing. Has javascript now developed a cult? Read up on the history of javascript if my message above comes as a shock to you.

crazygringo13y ago

RyanZAG13y ago

eg:

http://clang.llvm.org/docs/Block-ABI-Apple.html

http://download.oracle.com/otndocs/jcp/lambda-0_5_1-edr2-spe...

http://www.microsoft.com/en-us/download/details.aspx?id=7029

Feel free to downvote this post - I now concede that designed languages are just as likely to design stupid bugs into the language as evolved languages are to create them by mistake. ;)

viraptor13y ago

It's a bug in a complex element (GC). They happen. I wouldn't attribute it to a specific language, since even this post only mentions the latest Chrome.

There's a small reproducer example, so there will be a new version issued shortly. Not really something to get excited about in my opinion.

RyanZAG13y ago

nknighthb13y ago

pcwalton13y ago

This is a V8 problem, not a JavaScript problem. Any implementation of any garbage-collected language with higher-order functions could have this issue.

wtetzner13y ago

lquist13y ago

I wonder if this will temper the rise of the server-side JS frameworks?

RyanZAG13y ago

PommeDeTerre13y ago

mayank13y ago· 9 in thread

TL;DR -- all closures created within a function are retained even if only one closure is ever referenced. This is really interesting, and potentially devastating for certain coding styles.

I ran the code on node and the RSS does appear to be increasing without bound. Even running node with --expose-gc and forcing a global.gc() inside logIt() causes the same unbounded memory growth.

In fact, removing the call to doSomethingWithStr() doesn't change the unbounded memory growth.

Here's a shorter snippet that demonstrates the leak more dramatically (about 10 MB/sec RSS growth):

  var run = function () {
    var str = new Array(10000000).join('*');
    var fn  = function() {
      str;
    }
    var fn = function() { }
    setInterval(fn, 100);
  };
  setInterval(run, 1000);

Tried it out on node v0.8.18

glasserOP13y ago

mayank13y ago

Correct. I'm actually a little surprised that an identity statement like "str;" isn't optimized out.

Edit: removing the function definition eliminates the problem, so I'd say that your diagnosis of the problem is spot on.

richardofyork13y ago

>> TL;DR -- all closures created within a function are retained even if only one closure is ever referenced. This is really interesting, and potentially devastating for certain coding styles. <<

reeses13y ago

I'm a very lazy (oops) language implementer. Every time but one when I've implemented closures, I took the quick approach of just copying the entire frame to the heap.

The other time, I was able to do more data flow analysis, but it also resulted in a bunch of annoying fiddly bugs and took more maintenance.

(See the recent discussion over RubyMotion for examples of the opposite problem: http://news.ycombinator.com/item?id=5949072)

bzbarsky13y ago

Note that the TL;DR may well be V8-specific; other engines optimize closures differently and may not have the same action-at-a-distance interaction between closures.

sampk13y ago

What is "RSS"?

tantalor13y ago

After a nontrivial degree of Googling,

http://en.wikipedia.org/wiki/Resident_set_size

sampk13y ago

It was an honest question. Thanks for the downvote moron, very helpful.

casual_slacker13y ago

I'm very confused by this whole thread.

crazygringo13y ago· 8 in thread

I'm actually surprised/impressed that Chrome has an optimization to detect which closure variables are actually used in a function and garbage-collect the rest.

tolmasky13y ago

It's just a reduction to demonstrate the problem. I'm sure the code that actually triggered it was much more complex and had valid reasons for doing whatever it was doing.

crazygringo13y ago

I'm sure it was more complex, of course.

Dylan1680713y ago

> I just always assumed that all the variables present in a closure are maintained, if there's an existing reference to any function defined within.

In the logical sense each function has its own closure. There is no way functions should affect which[edit for clarity] variables are closed on by each other. So you shouldn't expect this.

>I'm actually surprised/impressed that Chrome has an optimization to detect which closure variables are actually used in a function and garbage-collect the rest.

In the logical sense functions close over their variables, not all variables that happen to be in scope. So that optimization is needed.

crazygringo13y ago

If I'm understanding what you're saying, that's just not correct. Try and run this code:

    var a = function() {
      var z = "a";
      window.b = function() { z = "b"; }
      window.c = function() { return z; }
    };
    a();
    window.c(); // returns "a"
    window.b();
    window.c(); // returns "b"

As you can clearly see, the single closure containing the variable "z" is shared between functions b() and c().

So each function does not have its own closure -- closures work "upwards", referencing everything in every scope above them.

jcampbell113y ago

> In the logical sense each function has its own closure. There is no way functions should affect the variables closed on by each other. So you shouldn't expect this.

What? The whole point of a closure is that it is shared.

ulisesrmzroche13y ago

In JS, functions can refer to variables defined in outer scopes, and can refer to variables defined in outer functions even after those functions have returned. JS functions are first-class objects.

They also store any variables they may refer to that are defined in their enclosing scopes, including the parameters and variables of outer functions.

bzbarsky13y ago

It's a common enough optimization for JS engines in some form.

The fact that things then end up not referenced and can be gced is just a side-effect of the primary goal of the optimization: faster access to variables in a closure.

lesslaw13y ago

Works as per spec, won't fix. Closing.

The definition of "memory leak" is smeared all across the road.

ChuckMcM13y ago· 5 in thread

FWIW those are certainly the twisty bits of garbage collected languages.

pcwalton13y ago

> The parse tree for the closure should be able to ascertain the reachability of the variables in scope so that fences around particular sets of variables can be established.

Sadly it's not quite that simple because of the behavior of `eval` (strictly speaking, direct `eval`): it can leak bindings into the surrounding scope.

jcampbell113y ago

The eval case is already being handled: the entire local scope is automatically captured if there is an eval.

It seems like what ChuckMcM is describing is reference counting which variables are needed in the closure. His suggestion would fix this memory issue, but would probably have a big performance penalty.
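
A small sketch of why a direct `eval` forces the whole scope to be kept. `makeInspector` and its variable names are made up for illustration. The returned function never mentions `hidden` or `alsoHidden` by name in its own source, yet a direct call to `eval` can look up either one at runtime, so an engine cannot safely drop them.

```javascript
function makeInspector() {
  var hidden = 'kept alive';
  var alsoHidden = 12345;
  // Direct call to eval: the string is evaluated in this function's scope,
  // so any enclosing variable may be needed later.
  return function inspect(name) {
    return eval(name);
  };
}

var inspect = makeInspector();
console.log(inspect('hidden'));     // kept alive
console.log(inspect('alsoHidden')); // 12345
```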

glasserOP13y ago

So you should be able to statically determine if this is the case, and it's not here.

j_s13y ago

Nice! You've explained why this 'memory leak' could be by design - blame eval.

In fact, Chrome GC-ing unreferenced variables could break eval if they didn't cover that case.

ChuckMcM13y ago

33a13y ago· 5 in thread

That is surprising? A setInterval without a matching clearInterval is always a memory leak.

glasserOP13y ago

The surprise isn't that the memory associated with `logIt` is leaked.

The surprise is that `logIt` holds on to the giant `str` object in its environment, despite the fact that there is literally no way for the `str` variable to ever be used again.

glasserOP13y ago

BTW, the original bug in the Meteor codebase wasn't a setInterval.

byroot13y ago

No it's not.

  setInterval(function() { console.log('foo'); }, 1000)

Is not a memory leak.

But I agree that it's not surprising at all. If you can access a variable from somewhere it will obviously not be garbage collected.

mayank13y ago

The point is that you can't access str from elsewhere, which is why it is surprising.

33a13y ago

That allocates a closure. Now if you want it to sit around forever, then hey it is working as intended. Otherwise, it does "leak" some memory that will never get reclaimed by the GC.

glasserOP13y ago· 3 in thread

shardling13y ago

e: Here's an example of one of those corner cases:

    console.log = eval

Now whether those functions reference a particular variable depends on the string they're printing! And this swap out of log() could be done at any time.

e2: Ah, as glasser and pcwalton point out, an indirect reference to eval doesn't work the same way as a direct call. TIL!

glasserOP13y ago

No, actually, you literally have to call eval as a function called eval if you want to get the local lexicals. See sections 10.4.2 and 15.1.2.1.1 of the ECMAScript standard, or try it out:

   > (function () { var x = 5; eval("console.log(x)"); })()
   5
   > (function () { var x = 5; var e = eval; e("console.log(x)"); })()
   ReferenceError: x is not defined

masklinn13y ago

I haven't checked the first two samples but:

* Safari 5.1 exhibits no significant heap growth, as far as the timeline shows anyway

* Using top to observe private memory, Firefox does seem to experience significant heap growth on the provided program

richardofyork13y ago· 2 in thread

First, it is not a bug in JavaScript at all.

Here is a technical explanation of why there is a memory leak and how to fix the problem.

Therefore, the activation object of the outer function cannot be destroyed (for garbage collection) after it has returned, because the closure still references its variables.

For a detailed explanation of closures in JavaScript, see "Understand JavaScript Closures With Ease": http://javascriptissexy.com/understand-javascript-closures-w...

nknighthb13y ago

You seem to be describing behavior that the article did not. Specifically, the retention of 'str' because of the 'logIt' function.

If you read more closely, you'll notice that logIt was not the cause of 'str's retention, but instead, the doSomethingWithStr function was.

When logIt is present by itself, str is not retained. Only when doSomethingWithStr is added alongside logIt is str retained.
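
To make the two cases concrete, here is a rough sketch using the names from the article (`str`, `logIt`, `doSomethingWithStr`); the wrapper functions `makeLogItAlone` and `makeLogItWithSibling` are hypothetical. The code only shows the shape of each case. Whether `str` is actually retained depends on the engine, and this snippet does not measure memory.

```javascript
// Case 1: logIt is the only closure and never mentions str.
// An engine that tracks per-variable usage can leave str out of the
// environment entirely, so keeping logIt alive need not keep str alive.
function makeLogItAlone() {
  var str = new Array(1000000).join('*');
  var logIt = function () { return 'interval'; };
  return logIt;
}

// Case 2: a sibling closure mentions str. It is never called and never
// escapes, but if both closures share one environment, keeping logIt
// alive can keep str alive too.
function makeLogItWithSibling() {
  var str = new Array(1000000).join('*');
  var doSomethingWithStr = function () {
    if (str === 'something') return 'never happens';
  };
  var logIt = function () { return 'interval'; };
  return logIt;
}

var alone = makeLogItAlone();
var withSibling = makeLogItWithSibling();
console.log(alone(), withSibling()); // both behave identically when called
```

Both returned functions behave the same when called; the difference is only in what each may keep reachable.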

richardofyork13y ago

nknighthb, I am just wondering. Did you downvote me because you think one part of my explanation was not specific enough?

It is not until all closures are destroyed or have returned that the outer function's scope activation object is destroyed.

wingspan13y ago· 2 in thread

The C# compiler does the same thing:

http://blogs.msdn.com/b/ericlippert/archive/2007/06/06/fyi-c...

This is an old article, but I believe it still applies to the latest iteration of the language.

RyanZAG13y ago

Does indeed still apply, and is actually formalized into the language spec itself [1] in section 7.15.5.1 Captured outer variables. So it's actually a feature (really) in the case of C#, not a bug.

[1] http://www.microsoft.com/en-us/download/details.aspx?id=7029

br113y ago

I don't see how that part of the standard mandates that all anonymous functions share the same closure. The example only has a single lambda.

Tloewald13y ago· 2 in thread

It's more of a surprising non-leak caused by clever GC that is insufficiently clever to handle pathological code. Interesting explanation, but yet another reason to be very careful of setInterval.

ufo13y ago

Not only setInterval though. This sort of leak can also show up if the closure is a DOM event handler or is otherwise exposed.

Tloewald12y ago

It can show up anywhere (and in fact event handlers are a major source of leaks in general), but the point is that the leak doesn't occur as frequently as you might expect, not that it occurs.

bsimpson13y ago· 2 in thread

I don't know how you can safely declare what's used vs unused in a language where properties can be dynamically accessed.

masklinn13y ago

You can't dynamically access scope objects in standard javascript[0] except by using `eval`, and `eval` already triggers special handling and deoptimizations in pretty much all browsers.

[0] outside of the global scope through `window` but all variables are clearly local here so that doesn't apply.

bsimpson13y ago

Doh! Good point.

btilly13y ago· 2 in thread

Implementing the simplest thing that is correct according to the spec sounds like a good idea to me.

glasserOP13y ago

Yes, but you can statically determine whether or not eval is called.

A world-class JavaScript environment like V8 is far past "the simplest thing that is correct according to the spec".

It just would be even better if they went one step farther.

btilly13y ago

V8 attempts to be fast. There is a trade-off between sophisticated analysis during compilation, and page performance.

I'm not as convinced as you that they haven't found the right balance.

mAritz13y ago· 1 in thread

The fix in https://github.com/meteor/meteor/commit/49e9813 seems to be just limiting the effect of the problem, not removing it.

The variable is still being held in memory just with the value null instead of whatever it was before.

Or does setting it to null trigger some special GC algorithm that detects that it isn't being used anymore?
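
For reference, a sketch of the nulling pattern, based on the reduced example posted in this thread rather than the actual Meteor commit. The variable slot for `originalThing` does stay in the shared environment, but once it holds `null` it no longer points at the previous `theThing`, so the old object (and its `longStr`) is unreachable and an ordinary GC pass can collect it. No special algorithm is assumed. `replaceThing` is called directly here instead of through `setInterval` so the snippet terminates.

```javascript
var theThing = null;
var replaceThing = function () {
  var originalThing = theThing;
  // Never called, but its mention of originalThing puts that variable
  // into the environment shared with someMethod below.
  var unused = function () {
    if (originalThing)
      console.log("hi");
  };
  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {}
  };
  // Break the chain: the slot remains, but it no longer references
  // the previous theThing.
  originalThing = null;
};

for (var i = 0; i < 3; i++) {
  replaceThing();
}
console.log(theThing.longStr.length); // 999999
```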

mistercow13y ago

gsg13y ago· 1 in thread

This problem has been encountered and studied before: see Appel and Zhao's paper, Efficient and Safe-for-Space Closure Conversion.

glasserOP12y ago

dreamdu5t13y ago· 1 in thread

I'm surprised this comes as a surprise to Meteor developers. References retained won't be garbage collected... regardless of whether the lexical scope is "really small" or not, lol.

wtbob13y ago

The surprise is that a non-referenced value isn't collected, not that a referred-to value isn't.

glasserOP13y ago

OP here. Several people have pointed out that of course I should expect a leak, since each `logIt` object is leaked. That's absolutely true; my point was just that we don't want `str` to leak.

    var theThing = null;
    var replaceThing = function () {
      var originalThing = theThing;
      // Define a closure that references originalThing but doesn't ever actually
      // get called. But because this closure exists, originalThing will be in the
      // lexical environment for all closures defined in replaceThing, instead of
      // being optimized out of it. If you remove this function, there is no leak.
      var unused = function () {
        if (originalThing)
          console.log("hi");
      };
      theThing = {
        longStr: new Array(1000000).join('*'),
        // While originalThing is theoretically accessible by this function, it
        // obviously doesn't use it. But because originalThing is part of the
        // lexical environment, someMethod will hold a reference to originalThing,
        // and so even though we are replacing theThing with something that has no
        // effective way to reference the old value of theThing, the old value
        // will never get cleaned up!
        someMethod: function () {}
      };
      // If you add `originalThing = null` here, there is no leak.
    };
    setInterval(replaceThing, 1000);
