Worst case, 2× is actually a steal compared to a non-moving allocator such as malloc/free or RC built on top of that, which cannot[1] do better than about 1 + ½log₂(largest allocation / smallest allocation). For example, if you have allocations from 1K to 16K bytes, any malloc/free implementation can require at least 3× the memory you actually use; if from 16 to 16K, 6×; etc.
At this point I must mention that paged memory can get you just enough movability that catastrophic fragmentation of this kind is avoided[2] with high probability. Paging tricks can be seriously expensive on a modern multicore processor, though, so I’m not sure this is the way forward. (The paper report only 1% perf overhead for Firefox; I don’t know if that means Firefox is less multithreaded than it could be or what.)
[1] https://www.sqlite.org/malloc.html
[2] https://github.com/plasma-umass/mesh