I understand it's all tradeoffs, and we all have real-world limitations to contend with -- but again, of all the corners that could be cut, that's exactly the one I didn't imagine they would cut.
Nasty.
One of them was unburdened by any thought of freeing stuff, and relied entirely on the application exiting for cleanup. This was very convenient to work with, and never ended up posing an issue.
Another used a series of allocation arenas, where certain arenas would be cleared at certain points in the compiler pipeline. This made allocation and freeing fast and also avoided leaks, since you weren't at risk of "forgetting" a data structure. It was also a major headache to keep track of exactly what the longest lifetime of a long-lived data structure might be, and to pick an arena that wouldn't be cleared in the meantime. Unfortunately the programs compiled with this compiler were large enough that we certainly couldn't have gotten away with just leaking memory; we sometimes OOMed as-is!
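For anyone who hasn't worked with arenas, here is a minimal sketch of the idea in C++. It is not the code from that compiler: `Arena`, `Token`, `Symbol`, and `pipeline_demo` are all made-up names, and the arena is a simple fixed-size bump allocator restricted to trivially destructible types. It shows why clearing is cheap, and why picking the wrong arena leaves you with a dangling pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>

// Hypothetical bump arena: allocating is an offset increment, and reset()
// discards every object at once. No destructors are ever run.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity), used_(0) {}

    // Copies `value` into the arena; returns nullptr if the arena is full.
    template <typename T>
    T* create(const T& value) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "reset() never runs destructors");
        static_assert(alignof(T) <= alignof(std::max_align_t),
                      "over-aligned types are not supported in this sketch");
        // Round the current offset up to the alignment of T.
        std::size_t start = (used_ + alignof(T) - 1) / alignof(T) * alignof(T);
        if (start + sizeof(T) > capacity_) return nullptr;
        used_ = start + sizeof(T);
        return new (buffer_.get() + start) T(value);
    }

    // "Frees" everything in O(1); all pointers handed out become invalid.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t used_;
};

struct Token { int kind; int offset; };  // only needed while parsing
struct Symbol { int id; };               // needed for the whole run

// One arena per lifetime: the hard part is choosing the right one.
int pipeline_demo() {
    Arena parse_arena(1024);
    Arena program_arena(1024);
    Token* tok = parse_arena.create(Token{1, 0});
    Symbol* sym = program_arena.create(Symbol{42});
    assert(tok != nullptr && sym != nullptr);
    parse_arena.reset();  // end of parsing: tok must not be used after this
    return sym->id;       // still valid, program_arena was not reset
}
```

If `Symbol` had been put in `parse_arena` by mistake, the last line would read memory that the next phase is free to overwrite, which is exactly the lifetime bookkeeping headache described above.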
The third used standard C++ memory management. This compiler was quite simple, and the vast majority of its data used stack-based lifetimes. For a more complex compiler this would've become a headache.
I think that all of these compilers chose the correct allocation strategy for what they were doing. "Good practices" aren't as universal as we might like to believe; they depend entirely on the context in which a tool is designed to operate. And yes, we can guard to some extent against that context changing, but for the most part that's why we keep getting paid.
I was taught -- including at the start of my career, when I used exclusively C/C++ (about 18.5 years ago) -- to take care of all the resources I was using and not rely on runtimes.
I understand and appreciate that there are different use cases, but to me doing a proper cleanup was the sane default for most programmers. And that's all I was saying.
Obviously, as one digs deeper into a specialised area where more and more efficiency is demanded, one has to reach for tools that most of us normally wouldn't. That's quite normal, and it was always interesting for me to read about.
I don't think it's even that uncommon. I believe some HFT firms run Java with a huge amount of RAM and GC disabled, and get around it by just rebooting the software occasionally.
To me, writing software like that is fair game; I don't see the point in being dogmatic about "how things should be done".
I recall reading somewhere, years ago, that some OSes couldn't be relied upon to release unfreed memory when a process terminated. In those contexts, fastidious freeing would be important even in short-lived processes.
In my own C code I tend to free everything so that I get a clean trace from Valgrind and don't risk masking legitimate bugs, but I typically write long-running daemons.
For specialised apps and servers it's of course a perfectly good practice.
But still, in a world where languages and runtimes are also judged by their ability to run in lambda/serverless setups, I'd expect this practice to start becoming obsolete, wouldn't you?
(What I mean is that I imagine any serverless function running in a severely constrained and metered environment like AWS Lambda would gain a significant edge over the competition if it did an eager cleanup. That should allow more of them to run in parallel, no?)
And how many millions of iterations have been done successfully in that "awful" system?
The very fact that you never imagined it, I think, says a lot.
As I acknowledged in other comments of mine downthread, I understand that different situations require different tradeoffs. It's just that forgoing memory deallocation wasn't one of them in my head.
We wrote the entire thing (and tested it, using ASAN, fuzzers, and other techniques) to avoid leaking memory, and then strategically inserted [the equivalent of Rust's `mem::forget`](https://github.com/sorbet/sorbet/blob/0aae56e73c7680ec6053b3...) at the end of the `main` driver in standalone mode, to avoid calling those destructors when we're about to exit anyway.
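To make the trick concrete, here is a rough sketch of what "forget at the end of `main`" can look like in C++. This is not the code behind the link above: `GlobalState`, `leak_on_purpose`, `run`, and the `standalone` flag are hypothetical names, and the destructor counter exists only to make the effect visible.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Counts destructor runs so the effect is observable (illustration only).
int g_destroyed = 0;

// Hypothetical stand-in for a large, expensive-to-destroy program state.
struct GlobalState {
    ~GlobalState() { ++g_destroyed; }
};

// Hypothetical helper: give up ownership without running the destructor.
// The memory is deliberately left for the OS to reclaim at process exit.
template <typename T>
void leak_on_purpose(std::unique_ptr<T> owner) {
    assert(owner != nullptr);
    (void)owner.release();
}

// Stand-in for the body of main(); `standalone` is an assumed flag.
int run(bool standalone) {
    std::unique_ptr<GlobalState> state(new GlobalState());
    // ... do the actual work with *state ...
    if (standalone) {
        // About to exit anyway: skip the teardown entirely.
        leak_on_purpose(std::move(state));
    }
    return 0;  // if not leaked, ~GlobalState runs here as usual
}
```

The point of keeping normal ownership everywhere else is that tests, sanitizers, and any long-running mode still see a leak-free program; only the final teardown is skipped, and only when the process is about to exit.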
This optimization is definitely still relevant for new systems today.
We don't go around saying "oh, you didn't want modulo 5 arithmetic? You should've put that in the spec, not relied on some contrived absolute truth".
If this is incorrect, then every modern malloc implementation is incorrect.