Oh, so now it's about "efficiently". Goal posts have moved.
No, it's not hard at all to implement efficiently as long as objects don't cross a thread boundary. Nim's older GC used to enforce that condition. IIRC, an old version of K tracked it and switched to "lock; xadd" for counting references once an object did cross a thread boundary, which made counting inefficient only for the objects that crossed, and there usually weren't many of those.
It's way simpler than multithreaded mark&sweep, for example. Regardless, it makes a lot of sense. It might not make a lot of sense to you, but it does in most contexts.
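
To make the thread-boundary trick concrete, here is a rough sketch in Rust. It is not how Nim or K actually did it. `HybridCount`, `mark_shared` and the other names are made up for illustration, and it assumes the owner calls `mark_shared` before handing the object to another thread. While the object is thread-local the count is bumped with a plain load and store; once it is flagged as shared, it switches to an atomic read-modify-write (which is what becomes "lock; xadd" on x86).

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Hypothetical reference count that is cheap while thread-local and
/// only pays for atomic read-modify-write once it has been shared.
pub struct HybridCount {
    count: AtomicUsize,
    shared: AtomicBool,
}

impl HybridCount {
    /// Starts with one reference, owned by the creating thread.
    pub fn new() -> Self {
        HybridCount {
            count: AtomicUsize::new(1),
            shared: AtomicBool::new(false),
        }
    }

    /// Assumption: the owner calls this BEFORE the object becomes
    /// visible to any other thread. The flag is never cleared.
    pub fn mark_shared(&self) {
        self.shared.store(true, Ordering::Release);
    }

    /// Adds one reference.
    pub fn retain(&self) {
        if self.shared.load(Ordering::Acquire) {
            // Shared: a real atomic increment.
            self.count.fetch_add(1, Ordering::Relaxed);
        } else {
            // Thread-local: separate load and store, no atomic RMW.
            // Only correct because no other thread can see the object.
            let n = self.count.load(Ordering::Relaxed);
            self.count.store(n + 1, Ordering::Relaxed);
        }
    }

    /// Drops one reference; returns true if it was the last one.
    pub fn release(&self) -> bool {
        if self.shared.load(Ordering::Acquire) {
            self.count.fetch_sub(1, Ordering::AcqRel) == 1
        } else {
            let n = self.count.load(Ordering::Relaxed) - 1;
            self.count.store(n, Ordering::Relaxed);
            n == 0
        }
    }

    /// Current number of references (for inspection only).
    pub fn count(&self) -> usize {
        self.count.load(Ordering::Relaxed)
    }
}
```

Objects that never leave their thread never take the atomic branch, so only the ones that actually cross pay the extra cost.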