The reverse (running on bare metal and tweaking an existing malloc to run on it) looks way more logical to me.
> No other system lets you avoid the libc.
On most OSes [1] it’s relatively easy to write a libOS that just wraps the system calls. Your only dependency would be on the mapping to syscall numbers.
[1] OpenBSD is, or may become, an exception (I don’t know the current status of this work). See https://man.openbsd.org/pinsyscalls.2 and https://lwn.net/Articles/949078/
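
To make the "just wrap the system calls" point concrete, here is a minimal sketch, assuming Linux on x86-64 or aarch64 and a GCC/Clang-style compiler. The names `raw_syscall3`, `my_write` and `my_getpid` are made up for illustration; the hard-coded numbers are exactly the "mapping to syscall numbers" dependency, and they differ per architecture.

```c
#include <stddef.h> /* freestanding header (size_t only), not part of libc proper */

/* Assumption: Linux only. Syscall numbers are hard-coded per architecture. */
#if defined(__linux__) && defined(__x86_64__)
enum { SYS_NR_WRITE = 1, SYS_NR_GETPID = 39 };

/* x86-64: number in rax, args in rdi/rsi/rdx; the kernel clobbers rcx and r11. */
static long raw_syscall3(long nr, long a, long b, long c)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr), "D"(a), "S"(b), "d"(c)
                      : "rcx", "r11", "memory");
    return ret;
}
#elif defined(__linux__) && defined(__aarch64__)
enum { SYS_NR_WRITE = 64, SYS_NR_GETPID = 172 };

/* aarch64: number in x8, args in x0..x2, result comes back in x0. */
static long raw_syscall3(long nr, long a, long b, long c)
{
    register long x8 __asm__("x8") = nr;
    register long x0 __asm__("x0") = a;
    register long x1 __asm__("x1") = b;
    register long x2 __asm__("x2") = c;
    __asm__ volatile ("svc #0"
                      : "+r"(x0)
                      : "r"(x8), "r"(x1), "r"(x2)
                      : "memory");
    return x0;
}
#else
#error "this sketch covers only Linux on x86-64 and aarch64"
#endif

/* Thin wrappers: on failure they return -errno directly, there is no errno variable. */
static long my_write(int fd, const void *buf, size_t len)
{
    return raw_syscall3(SYS_NR_WRITE, fd, (long)buf, (long)len);
}

static long my_getpid(void)
{
    return raw_syscall3(SYS_NR_GETPID, 0, 0, 0);
}
```

Calling `my_write(1, "hi\n", 3)` writes to stdout without going through libc. A real libOS would still need its own startup code, error handling and the rest of the syscall table, and on a system that pins syscall entry points this approach may not work at all.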