The .NET CLR has the exact same problem (perhaps a harder one, since the CLR has a moving GC), so any time its code touches GC references (pointers to collectible objects), they are always wrapped in an explicit GC stack frame (think of a GC struct that lives on the stack). Furthermore, all reads and writes are carefully done with macros (which of course expand to volatile plus some other stuff) to make sure the compiler doesn't optimize them away.
On the one hand, this is nice because they don't need to scan the C stack (it scans the VM stack and the fake GC frame stacks -- well, it's one stack, but you skip the native C frames); on the other hand, this means that any time a GC object is used in C code (ok, actually it's C++) they have to be real careful to guard it.
Of course bugs crop up all the time where an object gets collected when it shouldn't have been; it happens so often that there is a name for it -- "GC Hole".
Astute readers and users of p/invoke may remark that they don't have to set up any "GC frames" -- that is because this complicated scheme is not exposed outside of the CLR source. Regular users of .NET who want to marshal pointers between native/managed can simply request that a GC reference gets pinned, at which point I'm mostly sure it won't get collected until it's unpinned.
The bad news is I'm almost positive there is nothing you can do with just C here to make this problem go away. You'd want stuff to magically just happen under the hood, and C++ is the right way to go for that.
It's probably possible to create an RAII-style C++ GC smart pointer that would be 99% foolproof at the expense of some performance. It gets a little bit trickier if we are doing a moving collector. I am thinking it could ref/unref at creation/destruction, and disallow any direct raw pointer usage so you can't shoot yourself in the foot.
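To make the ref/unref idea concrete, here is a minimal sketch of such a handle. `RootTable` and `GcHandle` are made-up names for illustration, not part of any real runtime; the table just stands in for whatever root set a real collector would consult while marking, and nothing here handles a moving collector.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for a collector's root set: object -> pin count.
class RootTable {
public:
    void ref(void* obj) { ++counts_[obj]; }
    void unref(void* obj) {
        auto it = counts_.find(obj);
        if (it != counts_.end() && --it->second == 0) counts_.erase(it);
    }
    bool is_rooted(void* obj) const { return counts_.count(obj) != 0; }
private:
    std::unordered_map<void*, std::size_t> counts_;
};

// Hypothetical RAII handle: roots the object on construction and
// unroots it on destruction, so a scope exit can never forget to unref.
template <typename T>
class GcHandle {
public:
    GcHandle(RootTable& roots, T* obj) : roots_(&roots), obj_(obj) {
        roots_->ref(obj_);
    }
    GcHandle(const GcHandle& other) : roots_(other.roots_), obj_(other.obj_) {
        roots_->ref(obj_);
    }
    GcHandle& operator=(const GcHandle& other) {
        other.roots_->ref(other.obj_);  // ref first so self-assignment is safe
        roots_->unref(obj_);
        roots_ = other.roots_;
        obj_ = other.obj_;
        return *this;
    }
    ~GcHandle() { roots_->unref(obj_); }

    // No get() accessor is offered; this discourages (but cannot fully
    // prevent) keeping a raw pointer that outlives the handle.
    T* operator->() const { return obj_; }
    T& operator*() const { return *obj_; }
private:
    RootTable* roots_;
    T* obj_;
};

struct Str { int len; };  // toy object type, for the demo only

// True if the object is rooted inside the scope and unrooted afterwards.
bool handle_roots_for_scope_only() {
    RootTable roots;
    Str s{5};
    bool inside = false;
    {
        GcHandle<Str> h(roots, &s);
        inside = roots.is_rooted(&s) && h->len == 5;
    }
    return inside && !roots.is_rooted(&s);
}

// True if destroying a copy leaves the original handle's root in place.
bool copies_share_the_root() {
    RootTable roots;
    Str s{1};
    GcHandle<Str> a(roots, &s);
    { GcHandle<Str> b = a; }
    return roots.is_rooted(&s);
}
```

The performance cost mentioned above shows up here as a hash-table update on every handle copy and destruction.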
Of course the people writing the GC still need to worry about this..
What makes it 1/ irreversible and 2/ bad for today's users?
EDIT: as well, I wouldn't stop using Ruby because of that; I would use JRuby or Rubinius or IronRuby (if I understand correctly, these aren't affected?)
A plausible rewrite of that function in an XS for Ruby would leave the function declaration and wrapper code up to your equivalent of xsubpp, which would execute your DSL and transform the wrapped code into fully functional C. If you build a Perl extension that uses C, you'll find an XS file like http://cpansearch.perl.org/src/SIMON/Devel-Pointer-1.00/Poin... which, during the `perl Makefile.PL && make` step, is transformed via `xsubpp Pointer.xs > Pointer.c` and then compiled as normal C.
Shit! MRI/YARV/REE are inherently fatally flawed! All that code I have running in production must be a FIGMENT OF MY IMAGINATION! SAVE YOURSELVES
Yours in perpetual bogglement,
Lil' B
If Rubinius/IronRuby/JRuby have no issues, this may become moot eventually, as Rubinius has been gaining a lot of traction recently and is getting faster with each release, outperforming the standard Ruby VMs in many cases.
However, I would like to see Matz' response to the recommended steps for a fix at the end. Sounds like a reasonable goal to add for Ruby 2.0.
Note to self: Listening to Papoose while writing a technical blog post turns your otherwise important observations into a Chicken Littleish, end-of-the-world rant.
I don't intend this to be an inflammatory question. I'm sort of a perpetual Ruby novice: it's never been my day job and I've never managed to catch up with the community; as soon as I feel pretty good with something, I find it's been obsoleted a couple of times. I like it, but how does the community at large deal with stuff like this? This guy found a real bug and invested some time in it. Do other Rubyists just deal with crashes and restart their stuff? Do they just consider it part of "being on the cutting edge"? Or do they not even notice?
That's what makes the hyperbolic tone of this article so douchey; he wrote up an interesting dissection of an edge case issue as though it were an ongoing catastrophe, mostly just to inject a bunch of chest-thumping rock-star bravado that added nothing of value to the discussion.
This is the implementation of `select.epoll`. Some things you'll notice: there are no GC details (allocations of C-level structs outside the GC are handled nicely with a context manager), and we have a declarative (rather than imperative) mechanism for specifying argument parsing for Python-level methods. This ensures consistency in readability as well as error handling, etc.
That said, I like Handle, the RAII thing that V8 uses. It also allows for compacting collection. Too bad C doesn't do RAII.
[1] http://www.shafqatahmed.com/2008/05/memory-control.html
[2] http://publib.boulder.ibm.com/infocenter/javasdk/v5r0/index....
The faulty assumption seems to be that looking only for references to RVALUEs (Ruby objects in the heap) is enough to determine whether a piece of memory can be freed. This breaks down in C extensions, where macros extract some part of the object, or something it points to, for use. In this case RSTRING_PTR extracts the C char array used by str for zstream_append_input to use (let's call it arr).
If zstream_append_input or any call underneath it tries to allocate a new Ruby object, the GC may run and str (and thus arr) may get freed, because there are no references left to it anymore (none on the heap, on the stack, or in a register, because the register value was overwritten).
And this seems to require all Ruby C-extension writers to lock the objects they're using through macros with RB_GC_GUARD.
Edit: note that there are no references left to str
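A toy model of that sequence, in case it helps. This is not MRI's collector: `ToyGc`, `ToyString` and `survives_collection` are invented for illustration, and the `visible` set simply stands in for the object pointers a conservative scan would find on the stack and in registers.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical toy object: a heap "string" that owns a character buffer.
struct ToyString {
    std::string buf;
    bool freed = false;
};

// Hypothetical toy collector. Anything not in `visible` at collection
// time is treated as garbage (marked freed; the model never really
// releases memory, so nothing here dereferences a dangling pointer).
struct ToyGc {
    std::vector<ToyString*> heap;
    std::unordered_set<ToyString*> visible;

    ToyString* alloc(const std::string& s) {
        ToyString* o = new ToyString();
        o->buf = s;
        heap.push_back(o);
        return o;
    }
    void collect() {
        for (ToyString* o : heap)
            if (visible.count(o) == 0) o->freed = true;
    }
    ~ToyGc() {
        for (ToyString* o : heap) delete o;
    }
};

// Returns true if str survives a collection that runs while only the
// interior pointer arr is still in use by the caller.
bool survives_collection(bool guard) {
    ToyGc gc;
    ToyString* str = gc.alloc("compressed input");
    gc.visible.insert(str);             // str sits in a register or stack slot

    const char* arr = str->buf.data();  // plays the role of RSTRING_PTR(str)
    (void)arr;                          // the scan does not treat arr as a reference to str

    if (!guard)
        gc.visible.erase(str);          // compiler reuses the slot: last copy of str is gone

    gc.collect();                       // an allocation in the callee triggers a GC
    return !str->freed;
}
```

With `guard` set to false the object is reported as freed even though `arr` is still in use; with `guard` set to true, which models keeping `str` visible across the call, it survives.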
The Ruby C API is returning objects that are not correctly reference-counted for a short period of time and are incorrectly subject to GC.
This doesn't seem fatal to me, just not reasonably fixable from the GC side. It might be true that a new API is needed to hold refs on the C side.
Funktacularly yours,
Lil' B
BTW the CLR is not a good alternative runtime for Ruby, might not ever be: http://www.zdnet.com/blog/microsoft/whats-next-for-microsoft...
You did good work here -- don't hurt your credibility with overstatement.
"Volatile" is the wrong fix, by the way. That's just depending on yet another non-required behavior. There is in fact no further reference to "str" between the function call and the reassignment at the start of the next iteration, so there's nothing for "volatile" to chew on. This particular version of this particular compiler just happens to add an extra pair of stack operations in this case, but it's not truly required to. A real fix would not only mark the variable as volatile but also add a reference after the function call. The same "(void)str;" type of statement that's often used to suppress "unused argument/variable" warnings should count as a reference to force correct behavior here.
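Roughly what I mean, as a sketch only. `Value`, `ToyString`, `append_input` and `feed` are made-up stand-ins, not the Ruby C API, and I am assuming the volatile local ends up in a stack slot that the collector's scan can see; the language guarantees the read happens, not where the variable lives.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

using Value = void*;  // hypothetical stand-in for Ruby's VALUE

struct ToyString {
    const char* ptr;
    std::size_t len;
};

// Stand-in for a callee that, in the real case, might allocate and so
// trigger a collection while the caller only holds the interior pointer.
std::size_t append_input(const char* src, std::size_t len,
                         char* dst, std::size_t cap) {
    std::size_t n = len < cap ? len : cap;
    std::memcpy(dst, src, n);
    return n;
}

std::size_t feed(Value v, char* dst, std::size_t cap) {
    volatile Value str = v;  // accesses to str may not be optimized away
    ToyString* s = static_cast<ToyString*>(str);

    std::size_t n = append_input(s->ptr, s->len, dst, cap);

    (void)str;  // volatile read after the call: str is still referenced here
    return n;
}
```

Without the trailing `(void)str;` there is no access to `str` after the call, so volatile alone gives the compiler nothing it is obliged to keep alive across it.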