In addition to all the security problems inherent in mingling metadata directly next to user-data, the linked list gives O(Nalloc) performance on free(3) and realloc(3) whereas my malloc had O(1) performance.
That may not seem like a lot, but when the 'O' is a page-in from disk, and Nalloc is from C++ code, it is nothing to sneeze at.
Not only did that make my malloc faster, but it would (not "could", but "would") detect several classes of malloc-usage errors, double-free among them, unconditionally.
Over the next 10-ish years, that practically eliminated entire classes of malloc-mistakes, making a lot of FOSS software safer, no matter which malloc you used it with.
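To make the out-of-band metadata point concrete, here is a minimal sketch in C. It is not phkmalloc's code, just a toy fixed-slot allocator under stated assumptions: the names toy_alloc and toy_free, the status enum, and the slot and arena sizes are all made up for illustration. The bookkeeping lives in a separate array, so free computes the slot from the address in O(1) and can refuse a double-free outright.

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>
#include <stdint.h>

#define SLOT_SIZE 64    /* bytes per slot (hypothetical toy value) */
#define NSLOTS    128   /* slots in the arena (hypothetical toy value) */

static_assert(SLOT_SIZE % _Alignof(max_align_t) == 0,
              "every slot must stay suitably aligned");

/* User data lives here; no allocator bookkeeping is stored in this array. */
static _Alignas(max_align_t) unsigned char arena[NSLOTS * SLOT_SIZE];

/* Metadata lives apart from user data: one "in use" flag per slot. */
static unsigned char in_use[NSLOTS];

enum toy_status { TOY_OK, TOY_BAD_POINTER, TOY_DOUBLE_FREE };

void *toy_alloc(size_t size)
{
    if (size == 0 || size > SLOT_SIZE)
        return NULL;
    /* A linear scan keeps the sketch short; allocation is not the point here. */
    for (size_t i = 0; i < NSLOTS; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return arena + i * SLOT_SIZE;
        }
    }
    return NULL;
}

enum toy_status toy_free(void *ptr)
{
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t base = (uintptr_t)arena;

    /* Reject pointers outside the arena or not on a slot boundary. */
    if (p < base || p >= base + sizeof arena || (p - base) % SLOT_SIZE != 0)
        return TOY_BAD_POINTER;

    /* O(1): the slot index comes straight from the address, no list walk. */
    size_t slot = (size_t)((p - base) / SLOT_SIZE);
    if (!in_use[slot])
        return TOY_DOUBLE_FREE;   /* already free: caught unconditionally */

    in_use[slot] = 0;
    return TOY_OK;
}
```

Calling toy_free twice on the same pointer returns TOY_OK and then TOY_DOUBLE_FREE, and a pointer into the middle of a slot returns TOY_BAD_POINTER. A user overrunning a buffer can trash a neighbouring slot, but cannot reach the in_use flags by writing past the end of an allocation the way it could with an inline header.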
Given the definition of the malloc(3) family API, there is no way you can detect all corruption without hardware support, but there are people working on that too, notably Robert Watson's CHERI project at Cambridge.
So yeah, nice pointer arithmetic, but how about people solved the real problem instead?
(On big memory and multi-core systems use jemalloc, on small memory and single-core systems use phkmalloc.)