There is no Rust GC crate in widespread use because tracing GC isn't needed there, and because the absurd behavior of scanning most of the heap at arbitrary times makes it massively worse than reference counting (assuming you only change reference counts when essential and use borrowing the rest of the time).
The problems with GC are threefold, and why you might not want it in a systems language:
1. GC requires more memory than strictly necessary to be efficient (usually about 1.5x - 2x the amount you absolutely need). You're basically trading memory for runtime efficiency.
2. GC performance is harder to predict and reason about than certain other allocation strategies.
3. GC languages tend to encourage excessive heap allocation for various reasons, so programs end up with much more junk than a typical Rust or C program with a similar number of objects.
Note that item 2 is the one that's least understood. The best part about GCs is that they make heap allocation trivial, and they make de-allocation a no-op. In contrast, both malloc() and free() are extremely complex and costly operations. The GC does impose a cost on every pointer write, similar to (but typically less than) the overhead of Arc<T> over a T*, but that has a very uniform and predictable cost. The problem of unpredictability only comes in the collection phase, and is mostly related to (a) when the collection happens, (b) how much data actually has to be scanned (how many live objects are present on the heap and stack), and (c) what type of collection needs to happen (is it enough to collect from this thread's young generation, or do you need to collect all generations from all threads).
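To make that per-pointer-write cost concrete, here is a minimal sketch of a card-marking write barrier, one common way generational GCs track pointer writes so they can rescan only the touched regions. Every name here (`CardTable`, `CARD_SIZE`, `record_write`) is hypothetical, not the API of any real collector:

```rust
// A card table divides the heap into fixed-size "cards" and keeps one
// dirty bit per card. The write barrier is just an index plus a flag set:
// a small, uniform, predictable cost on every pointer store.

const CARD_SIZE: usize = 512; // bytes of heap covered per card (made up)

struct CardTable {
    cards: Vec<bool>, // one dirty bit per card
}

impl CardTable {
    fn new(heap_bytes: usize) -> Self {
        CardTable { cards: vec![false; heap_bytes / CARD_SIZE + 1] }
    }

    // Invoked on every pointer write into the heap: constant time.
    fn record_write(&mut self, heap_offset: usize) {
        self.cards[heap_offset / CARD_SIZE] = true;
    }

    // At collection time, only dirty cards need rescanning.
    fn dirty_cards(&self) -> impl Iterator<Item = usize> + '_ {
        self.cards.iter().enumerate().filter(|(_, d)| **d).map(|(i, _)| i)
    }
}

fn main() {
    let mut table = CardTable::new(4096);
    table.record_write(100);  // falls in card 0
    table.record_write(1000); // falls in card 1
    let dirty: Vec<usize> = table.dirty_cards().collect();
    println!("{:?}", dirty); // prints [0, 1]
}
```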
Note that many of these problems are in fact solvable, and there actually exist GCs with constant, predictable collection times, suitable even for realtime applications (which malloc/free don't support). They rely on very sophisticated technology that no one is distributing for free, though (e.g. you have to pay Azul for a realtime-compatible Java).
I've worked on tiny embedded systems (.NET Micro Framework) where, for a given usage pattern, the GC was perfectly predictable, as it should be.
Assuming you use a set of slabs of fixed size objects and keep free objects in a linked list, both malloc and free are trivial O(1) operations.
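A minimal sketch of that scheme in Rust (index-based to stay in safe code; a real slab allocator would thread the free list through the slots themselves and hand out raw pointers, and all names here are made up):

```rust
// Fixed-size-object slab: every slot holds the same type, and free slots
// are kept on a list (here a stack of indices). Both alloc and free are
// a single push/pop: O(1), no searching, no coalescing.

struct Slab<T> {
    slots: Vec<Option<T>>,
    free: Vec<usize>, // stack of free slot indices (plays the free-list role)
}

impl<T> Slab<T> {
    fn with_capacity(n: usize) -> Self {
        Slab {
            slots: (0..n).map(|_| None).collect(),
            free: (0..n).rev().collect(),
        }
    }

    // O(1): pop a free slot and move the value in.
    fn alloc(&mut self, value: T) -> Option<usize> {
        let i = self.free.pop()?;
        self.slots[i] = Some(value);
        Some(i)
    }

    // O(1): take the value out and push the slot back on the free list.
    fn free(&mut self, i: usize) -> Option<T> {
        let v = self.slots[i].take();
        if v.is_some() {
            self.free.push(i);
        }
        v
    }
}

fn main() {
    let mut slab = Slab::with_capacity(2);
    let a = slab.alloc(10).unwrap();
    let _b = slab.alloc(20).unwrap();
    assert!(slab.alloc(30).is_none()); // slab full
    assert_eq!(slab.free(a), Some(10));
    assert!(slab.alloc(30).is_some()); // freed slot is reused
    println!("ok");
}
```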
Destructors with cascading deletions can take time bounded only by memory allocation, but you can solve that for instance by destroying them on a separate thread, or having a linked list of objects to be destroyed and destroying a constant number/memory size of them on each allocation.
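The second variant, retiring a constant number of doomed objects on each allocation, can be sketched like this (illustrative names, not a real allocator; the point is just the amortisation):

```rust
use std::collections::VecDeque;

// Instead of running a whole destructor cascade at free time, doomed
// objects go on a queue, and each allocation pays for a bounded amount
// of destruction work. Latency per operation stays constant.

const RETIRE_PER_ALLOC: usize = 2; // tuning knob, value made up

struct Pool<T> {
    doomed: VecDeque<T>,
}

impl<T> Pool<T> {
    fn new() -> Self {
        Pool { doomed: VecDeque::new() }
    }

    // "free" just enqueues; no cascade happens on this call.
    fn schedule_drop(&mut self, value: T) {
        self.doomed.push_back(value);
    }

    // Each allocation retires at most RETIRE_PER_ALLOC queued objects.
    fn alloc(&mut self, value: T) -> Box<T> {
        for _ in 0..RETIRE_PER_ALLOC {
            if self.doomed.pop_front().is_none() {
                break; // queue drained; the actual drop happened in pop_front
            }
        }
        Box::new(value)
    }
}

fn main() {
    let mut pool = Pool::new();
    for i in 0..10 {
        pool.schedule_drop(i);
    }
    let _x = pool.alloc(99); // retires 2 of the 10 queued values
    println!("{}", pool.doomed.len()); // prints 8
}
```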
I'm not sure what you mean by this being the least understood. It seems like it's very well understood: GC introduces latency if you want good throughput, or it reduces throughput if you want excellent latency (sub-100us is possible).
Of course, that doesn't mean you can predict what specific throughput or latency properties your specific program will have, except for GCs that have maximum bounded latency guarantees.
What do you mean? Aren't GCs both less efficient at run time and use more memory?
Take the extreme case of a program where every allocation is permanent. The GC allocation will be much much faster than malloc() (GC allocation is normally just bumping a pointer to the end of the heap, while malloc() typically does a lot more work to segregate objects by size etc), and collection will never run. So, overall, the program will be significantly faster.
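The bump-allocation fast path being compared here really is just a bounds check plus a pointer increment; a sketch (offsets into a byte buffer stand in for real pointers, and the names are made up):

```rust
// The entire GC allocation fast path: check that space remains, bump the
// top-of-heap marker, return the old position. No size classes, no free
// lists, no searching.

struct BumpArena {
    heap: Vec<u8>,
    top: usize, // next free offset
}

impl BumpArena {
    fn new(size: usize) -> Self {
        BumpArena { heap: vec![0; size], top: 0 }
    }

    fn alloc(&mut self, bytes: usize) -> Option<usize> {
        if self.top + bytes > self.heap.len() {
            return None; // a real GC would trigger a collection here
        }
        let at = self.top;
        self.top += bytes;
        Some(at)
    }
}

fn main() {
    let mut arena = BumpArena::new(64);
    assert_eq!(arena.alloc(16), Some(0));
    assert_eq!(arena.alloc(16), Some(16));
    assert_eq!(arena.alloc(64), None); // out of space
    println!("ok");
}
```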
Edit: More realistically, a program that typically produces a 1:1 ratio of live memory to junk, but occasionally spikes, will consume zero GC time in normal operation (the GC will just let the junk sit there), while free() needs to run on all that junk as soon as it is created, otherwise it leaks forever.
Also, the fact that GCs can defer collection to later than the object going out of scope can often make them more efficient than a traditional free(). For example, if you have a large graph data structure that is now junk, free() needs to do work that's O(size of graph) to free all the nodes. A copying collector, the most popular design, never even looks at the graph, so it frees it in O(1). Edit: of course, the converse is also true: if the graph survives N collections, the copying GC has to do O(N * size) work, where malloc()/free() do nothing at all.
At this point I must mention that paged memory can get you just enough movability that catastrophic fragmentation of this kind is avoided[2] with high probability. Paging tricks can be seriously expensive on a modern multicore processor, though, so I’m not sure this is the way forward. (The paper reports only a 1% perf overhead for Firefox; I don’t know if that means Firefox is less multithreaded than it could be or what.)
Manual memory management as a solution to the fragmentation problem makes the opposite trade: knowing nothing about when free might be called, it has to lean towards optimising space and immediate time rather than throughput. But there’s still a memory manager behind the scenes that has to deal with fragmentation as well; there’s no get-out-of-jail-free card for that, and that complexity is still hidden.
(Helpful memory usage disciplines like arenas/pools have their desirable properties for the same reasons: it’s a discipline on when you free memory in order to avoid fragmentation.)
So, there’s a performance hit of some kind, with optional tuning that takes no expertise. Many people would make those tradeoffs for some applications, especially if they got to keep using safe, no-GC code in their libraries, with its efficiency benefits.
I curate a list of what kinds of ownership people actually want: https://gist.github.com/o11c/dee52f11428b3d70914c4ed5652d43f...
It's been 6 years since I first posted it publicly, and neither I nor anyone giving suggestions has ever actually found a use for GC.
Rust uses plenty of reference counting. You could replace most of that with a GC; whether that's a win depends on your GC implementation, your reference counting implementation, and your requirements in terms of latency, throughput, and ability to handle cycles.
> Since it's not needed and it's massively worse than reference counting (assuming you only change reference counts when essential and use borrowing normally) due to the absurd behavior of scanning most of the heap at arbitrary times, there is no Rust GC crate in widespread use.
There are choices. You can have real time garbage collectors. And if you want to handle cycles, you need some kind of tracing anyway.
Also, if you only care about throughput and not about latency, garbage collection can easily be faster than reference counting.
If you don't allow cycles, you can also take advantage of that in your garbage collection. See how Erlang does it. (In Erlang, the garbage collector relocates your objects so that they are in topological order, so all references only point forwards in memory. You can combine that with generational gc, too, of course.)
Lol, what? Maybe don’t go asserting stuff you clearly know little about. Reference counting is a fine tradeoff for manually memory-managed languages, but it is absolutely blown out of the water by a tracing GC on most counts. It’s almost like JVM, V8, etc. engineers know a thing about the topic and avoid RC for a good reason.
Tracing GC doesn’t burden the mutator threads with additional work; almost everything can be done in parallel, resulting in vastly better throughput. Imagine dropping the last reference to a huge graph: you can actually observe this when exiting a C++ program, which might hang for a few seconds before returning control to you, as all the destructors are called recursively, serially, on the program thread, jumping across the heap literally pointer by pointer, the very thing you are so afraid of. And I didn’t even get to the atomic part: bumping a number up or down with synchronization between CPUs is one of the slowest operations you can do on a modern machine. Tracing GCs elegantly avoid all of these problems at the price of some memory overhead. None of these GC algorithms (yes, RC is a GC) is a silver bullet, but let’s not kid ourselves.
If you use reference counting properly in a well-designed language, it's obviously better than GC: refcount updates are rare, fast, simple, and local, and need no arbitrary heuristics.
The destructor cascades are only a problem for latency and potential stack overflow, and can be solved by giving recursive structures custom destructors that queue nodes for destruction, or by using arena allocators where applicable.
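In Rust, the custom-destructor fix for recursive structures is usually written as an iterative Drop; a minimal sketch for a singly linked list:

```rust
// The default drop glue for a Box chain recurses once per node, so a long
// enough list can overflow the stack. This hand-written Drop unlinks nodes
// in a loop instead: O(n) time, O(1) stack.

struct Node {
    _value: u64,
    next: Option<Box<Node>>,
}

struct List {
    head: Option<Box<Node>>,
}

impl Drop for List {
    fn drop(&mut self) {
        let mut cur = self.head.take();
        while let Some(mut node) = cur {
            cur = node.next.take(); // detach before `node` itself is dropped
        }
    }
}

fn main() {
    // Deep enough that naive recursive drop glue would risk stack overflow.
    let mut list = List { head: None };
    for i in 0..1_000_000u64 {
        list.head = Some(Box::new(Node { _value: i, next: list.head.take() }));
    }
    drop(list); // returns normally thanks to the iterative Drop
    println!("dropped");
}
```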
Like, as I explicitly wrote, it is probably the correct choice for low-level languages close to the metal that want easy compatibility with other languages through FFI. But the method itself still has much lower throughput than a tracing GC when used in a similar manner. Anything else is a useless comparison, like asking whether a bicycle is better than a tank.