Spaces takes a different approach. It uses 64KB-aligned slabs, and the metadata lookup is just a pointer mask (ptr & ~0xFFFF).
The trade-off is that every free() incurs an L1 cache miss to read the slab header, and there is a 64KB virtual memory floor per slab. But in exchange, you get zero-external-metadata regions, instant teardown of massive structures like ASTs, and performance that surprisingly keeps up with jemalloc on cross-thread workloads (I included the mimalloc-bench scripts in the repo).
It's Linux x86-64 only right now. I'm curious if systems folks think this chunk API is a pragmatic middle ground for memory management, or if the cache-miss penalty on free() makes the pointer-masking approach a dead end for general use.
```c
((PageSize) (chunk->pageSize
    - ((PageSize) ((PageSize) ((PageSize) (sizeof(Page) + (sizeof(struct _Block)))
          + (PageSize) ((sizeof(double)) - 1u))
        & ((PageSize) (~((PageSize) ((sizeof(double)) - 1u))))))
    - ((PageSize) ((PageSize) ((PageSize) ((sizeof(FreeBlock) + sizeof(PageSize)))
        + (PageSize) (((((sizeof(double)) > (4)) ? (sizeof(double)) : (4))) -
```
[0]: https://github.com/xtellect/spaces/blob/422dbba85b5a7e9a209a...
[1]: https://github.com/xtellect/spaces/blob/422dbba85b5a7e9a209a...
[2]: https://github.com/xtellect/spaces/blob/422dbba85b5a7e9a209a...
My hunch is that it's the result of macro expansion in C (cc -E ...). So there's likely a larger code base with multiple files that was expanded into one large C file (sometimes called an amalgamation build), and they called it a day.
By "they", I mean the OP, a script, or an AI (or all three).
mspaces used one mutex per heap for the entire task (no thread-local caching or lock-free paths). Spaces has per-thread heaps, local caches (no atomic ops on same-thread alloc/free), and a lock-free Treiber stack (with ABA tagging) for cross-thread frees. mspaces doesn't track large allocations (>= 256 or 512KB) that hit mmap, so unless one knows to explicitly call mspace_track_large_chunks(...), destroy_mspace silently leaks them all (I think obstacks is good this way but is not a general fit imo). In Spaces, chunk_destroy walks and frees all the page types unconditionally.
Another small thing that may matter is error callbacks: Spaces triggers a callback that lets the application shed load or degrade gracefully. Effectively, heap walking (inspection?) in mspaces is a compile-time switch that holds the lock the whole time, doesn't track mmap (direct) allocations, and shares thresholds like mmap_threshold globally, whereas Spaces lets you tune everything per-heap. So I'd say Spaces is a better candidate for the use cases mspaces bolts on: concurrent access, hard budgets, complete heap walking, and per-heap tuning.
I don't see how that could possibly be true. Sounds like a low-ball estimate.
Also, I want to point out that the "tcmalloc" being used as a baseline in these performance claims is Ye Olde tcmalloc, the abandoned and now community-maintained version of the project. The current version of tcmalloc is a completely different thing that the mimalloc-bench project doesn't support (correctly; I just checked).
I can also tell you that this was written with Claude.
No issues with that in principle but I definitely would not trust Claude to get this stuff correct. Generally, it is quite bad at this kind of thing and usually in ways that are not obvious to people without experience.
Edit: Homie. Why is bench.sh fetching external resources? Call me old-fashioned, but it would be nice if, when I cloned the repository (and checked out any submodules that may exist), I had everything I need, right there.