0 points · nh2 · 1y ago · 0 comments

It's important to be aware that often it isn't the programming language that has the biggest effect on memory usage, but simply the settings of the memory allocator and the behaviour of the OS.

This also means that you cannot "simply measure memory usage" (e.g. using `time` or `htop`) without already having a relatively deep understanding of the underlying mechanisms.

Most importantly:

libc / malloc implementation:

glibc by default has heavy memory fragmentation, especially in multi-threaded programs. It means it will not return `malloc()`ed memory back to the OS when the application `free()`s it, keeping it instead for the next allocation, because that's faster. Its default settings will e.g. favour 10x increased RESident memory usage for a 2% speed gain. Some of this can be turned off in glibc using e.g. the env var `MALLOC_MMAP_THRESHOLD_=65536` -- for many applications I've looked at, this instantaneously reduced RES from 7 GiB to 1 GiB. Some other issues cannot be addressed, because the corresponding glibc tunables are bugged [1]. For jemalloc, `MALLOC_CONF=dirty_decay_ms:0,muzzy_decay_ms:0` helps to return memory to the OS immediately.
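A minimal sketch of how one could A/B these settings on the same program, assuming Linux and Python 3.9+. The helper names (`allocator_env`, `run_with_settings`) and the dict labels are made up for illustration; only the env vars themselves come from the text above. Note that `MALLOC_CONF` only does anything if the program is actually linked against (or preloaded with) jemalloc.

```python
import os
import subprocess

# Env var overrides quoted above. The dict keys are just illustrative labels.
ALLOCATOR_SETTINGS = {
    "default": {},
    "glibc": {"MALLOC_MMAP_THRESHOLD_": "65536"},
    "jemalloc": {"MALLOC_CONF": "dirty_decay_ms:0,muzzy_decay_ms:0"},
}


def allocator_env(name, base=None):
    """Return a copy of the environment with the chosen settings applied."""
    env = dict(os.environ if base is None else base)
    env.update(ALLOCATOR_SETTINGS[name])
    return env


def run_with_settings(cmd, name):
    """Run cmd under the given settings; return (exit code, peak RSS).

    os.wait4() gives the resource usage of this one child. On Linux,
    ru_maxrss is reported in KiB.
    """
    proc = subprocess.Popen(cmd, env=allocator_env(name))
    _, status, usage = os.wait4(proc.pid, 0)
    # We reaped the child ourselves, so record the exit code on the Popen object.
    proc.returncode = os.waitstatus_to_exitcode(status)
    return proc.returncode, usage.ru_maxrss
```

Usage would be something like `run_with_settings(["./myserver", "--benchmark"], "glibc")` versus `"default"` (the command is a placeholder). Peak RSS is only one of several numbers worth comparing; for a long-running process you'd rather sample RES over time.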

Linux:

Memory is generally allocated from the OS using `mmap()`, and returned using `munmap()`. But that can be a bit slow. So some applications and programming language runtimes instead use `madvise(MADV_FREE)`; this effectively returns the memory to the OS, but the OS does not actually do the costly mapping table changes unless it's under memory pressure. As a result, one observes hugely increased memory usage in `time` or `htop`. [2]
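One way to see how much of the reported RSS is such "freed but not yet reclaimed" memory is the `LazyFree` field the kernel exposes in `/proc/<pid>/smaps_rollup` on reasonably recent Linux versions. A rough sketch, assuming that file is available; the function names are made up, and "RSS minus LazyFree" is only an approximation of what the process really holds:

```python
def parse_kib_fields(text):
    """Parse 'Name:   123 kB' lines into a {name: KiB} dict.

    Lines that don't have that shape (like the address-range header of
    smaps_rollup) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def breakdown_from_text(text):
    """Split RSS into the MADV_FREE'd part and the rest (all in KiB)."""
    fields = parse_kib_fields(text)
    rss = fields.get("Rss", 0)
    lazy = fields.get("LazyFree", 0)
    return {
        "rss_kib": rss,
        "lazyfree_kib": lazy,
        "rss_minus_lazyfree_kib": rss - lazy,
    }


def rss_breakdown(pid="self"):
    """Read the breakdown for a live process (Linux only)."""
    with open(f"/proc/{pid}/smaps_rollup") as f:
        return breakdown_from_text(f.read())
```

If `lazyfree_kib` accounts for most of `rss_kib`, the scary number in `htop` is mostly memory the kernel can take back whenever it needs to.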

The above means that people are often completely unaware of what actually eats their memory and what the actual resource usage is, easily "measuring wrong" by a factor of 10.

For example, I've seen people switch between Haskell and Go (both directions) because they thought the other one used less memory. It actually was just the glibc/Linux flags that made the actual difference. Nobody made the effort to really understand what's going on.

Same thing for C++. You think without GC you have tight memory control, but in fact your memory is often not returned to the OS when the destructor is called, for the above reason.

This also means that the numbers for Rust or JS may easily be wrong (in either direction, or both).

So it's quite important to also measure memory usage with tools that work above `malloc()`, otherwise you may just measure the wrong thing.
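As a rough illustration of measuring at both levels, here is a sketch in Python, picked only because it ships an allocation tracer (`tracemalloc`) in the standard library; other runtimes have their own heap profilers. It records what the application considers live next to what the kernel reports as `VmRSS`. The function names and the workload are invented for the example, and the RSS part assumes Linux.

```python
import tracemalloc


def read_vm_rss_kib(status_path="/proc/self/status"):
    """Return VmRSS in KiB as reported by the kernel, or None if unavailable."""
    try:
        with open(status_path) as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None


def allocate_and_free(n_items=100_000):
    """Allocate many small objects, drop them, and record both views."""
    tracemalloc.start()
    data = [bytes(100) for _ in range(n_items)]
    traced_live, _ = tracemalloc.get_traced_memory()
    rss_live = read_vm_rss_kib()

    del data  # the application is done with this memory
    traced_after, _ = tracemalloc.get_traced_memory()
    rss_after = read_vm_rss_kib()
    tracemalloc.stop()

    return {
        "traced_live_bytes": traced_live,
        "traced_after_free_bytes": traced_after,
        "rss_live_kib": rss_live,
        "rss_after_free_kib": rss_after,
    }
```

The traced numbers drop as soon as the objects are gone. Whether the RSS numbers follow depends on the allocator and kernel behaviour described above, which is exactly why looking at only one of the two is misleading.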

[1]: https://sourceware.org/bugzilla/show_bug.cgi?id=14827

[2]: https://downloads.haskell.org/ghc/latest/docs/users_guide/ru...

Capricorn2481 · 1y ago

Why does no one ever talk about this? It is so weird to see a memory pissing match with no context like this. Thank you

nh2 (OP) · 1y ago

Because people don't know.

That includes users of low-level languages. They assume free() means free when it doesn't.

And assumption- and hope-driven development are less bothersome to the mind!

It's annoying to have to fact-check every sane assumption, but unfortunately it's required. Of course for anything that exists, somebody somewhere built a cache around it for average-case performance gains that destroys simplicity and adds pathological edge cases.

Most people learn this only when they try to run a real-world system that inexplicably runs out of RAM, or if they see an unreasonably large number and actually start digging instead of just accepting it.
