IMO, separate per-process heaps are the first big mistake. Even implementations like Erjang (Erlang on the JVM, built on the Kilim microthreading library, which I've also contributed to) improve on the copy-from-heap mechanism prevalent in vanilla Erlang, where a message is copied out of the sender's heap into the receiver's. On top of that, Erlang's memory allocator isn't well suited to multi-threaded allocation, which also means that Erlang doesn't (can't?) take advantage of tcmalloc, umem, Hoard, etc.
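To make the copying point concrete, here is a toy sketch of the two message-passing styles. It is not how BEAM or Erjang actually implement anything; `Process`, `send_copy`, and `send_shared` are hypothetical names made up for illustration, and Python's `copy.deepcopy` just stands in for copying a term between heaps.

```python
import copy


class Process:
    """Toy stand-in for a lightweight process with a private mailbox."""

    def __init__(self, name):
        self.name = name
        self.mailbox = []


def send_copy(receiver, message):
    # Separate-heap style: the receiver gets its own deep copy,
    # so the cost of a send grows with the size of the message.
    receiver.mailbox.append(copy.deepcopy(message))


def send_shared(receiver, message):
    # Shared-heap style: only a reference is enqueued. This is only
    # safe if the message is treated as immutable by both sides.
    receiver.mailbox.append(message)


if __name__ == "__main__":
    a, b = Process("a"), Process("b")
    msg = {"id": 1, "body": list(range(1000))}

    send_copy(a, msg)
    send_shared(b, msg)

    print(a.mailbox[0] is msg)  # the copy is a distinct object
    print(b.mailbox[0] is msg)  # the shared send is the same object
```

The trade-off is the usual one: copying keeps each process's garbage collection independent, while sharing makes sends cheap but pushes the cost onto a shared heap and its collector.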