(Like how linear search wins for small N, or insertion sort helps with small subarrays in introsort: more steps, but each step is also a lot cheaper on modern hardware due to fewer branch mispredictions, better cache locality, or something else that better fits the hardware's strengths.)
(And before anybody gets pedantic: technically, this is the node's "outdegree" when the tree is represented as a directed graph. If it were an undirected graph, we would have to count the parent node as well.)
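(For concreteness, here's the index arithmetic that gives each node of an array-backed d-ary heap its d children — a sketch of the standard 0-based layout, not something from the article:)

```python
D = 4  # branching factor of the d-ary heap (d = 4 here, chosen arbitrarily)

def children(i):
    # Node i's up-to-D children are stored contiguously.
    return range(D * i + 1, D * i + D + 1)

def parent(i):
    # Every node except the root has exactly one parent.
    return (i - 1) // D

# The root's children for D = 4:
print(list(children(0)))  # → [1, 2, 3, 4]
```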
Things not taught in CS that you need to know on the job are legion. By far the most important of these is the use of invariants. Second might be the memory hierarchy and cache semantics. Third might be the use of bitmaps and bitwise operations.
So, it's nice to see a detailed analysis of the structure like this! Perhaps if it becomes more popular, I will find more places to use it.
You don't need C++20. Even C++98 had std::bitset<>::count(), which has a nice conversion from unsigned long, and which, when compiled for a target with SSE4.2 or better (e.g. Nehalem), uses a POPCNT instruction. It is pretty simple to produce various other results, including countl_zero, from a popcount, with just a couple of additional bitwise ops.
Modern compilers are happy to keep a bitset<32> in the same register as an unsigned long, so the conversion takes exactly zero cycles. POPCNT takes three cycles, and the extra bitwise ops another couple.
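To make the "couple of additional bitwise ops" concrete: here's the usual smear-then-count derivation of countl_zero from popcount, sketched in Python for brevity (the same shift sequence works on a uint32_t in C++; the function names are mine):

```python
def popcount32(x):
    # Stand-in for POPCNT / std::bitset<32>::count().
    return bin(x & 0xFFFFFFFF).count("1")

def countl_zero32(x):
    # Smear the highest set bit into every position below it...
    x &= 0xFFFFFFFF
    x |= x >> 1
    x |= x >> 2
    x |= x >> 4
    x |= x >> 8
    x |= x >> 16
    # ...then the popcount equals the bit-width of the value,
    # so the leading-zero count is just the remainder out of 32.
    return 32 - popcount32(x)

print(countl_zero32(1))           # → 31
print(countl_zero32(0x80000000))  # → 0
```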
So in that case I would use this data structure even if it weren't faster. I can't count the number of times I have had to mess with inserting negative priorities into a min heap to create a max heap! We should just have one data structure that does both.
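For anyone who hasn't had to do it, the negation dance looks like this with Python's heapq (a standard workaround, not from the parent comment):

```python
import heapq

nums = [5, 1, 9, 3]

# heapq only provides a min-heap, so negate everything
# to make the smallest stored value the largest real one.
max_heap = [-x for x in nums]
heapq.heapify(max_heap)

largest = -heapq.heappop(max_heap)  # → 9
second = -heapq.heappop(max_heap)   # → 5
```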
(though taking this idea to its logical extreme means we should just use an order statistic tree for everything, since it not only gives you O(log n) min/max, but also O(log n) find-kth and rank-of-x)
That was waaay at odds with the mindset of the STL, which seemed to me, and I think to a lot of folks, like the state of the art in containers at the time: with the STL, if you prepend/insert much you need to use a deque/linked list, you decide pointer vs. value each time you declare a type, and if you want to know about the constant factor on an insert in the middle of the list, you probably already lost.
(Qt has/had some other fun pragmatic things, like copy-on-write types in a lot of places. I didn't really do nearly enough with it to discover the pitfalls, but was fun to read.)
So I think about that, about how there are likely some trees out there that could just as well be sorted arrays, about adaptive structures that change their behavior based on dynamically observing how they're used (one recently discussed here: http://blog.erlang.org/the-new-scalable-ets-ordered_set/), even about tricks in JS string implementations and VMs. I don't have answers, but I sometimes wonder if we should be thinking more in terms of tools that may not be perfect on all axes but soften or eliminate performance cliffs we take for granted. Not that the zero-overhead fine-tuned types are going away, just that they aren't the only imaginable approach.
I wonder if we'll see progress in that sort of imperfect-but-resilient tool over time. I think the long-run trend is towards some form of programming being accessible to more people more easily (like how, as you note, folks do real work in Python now). That certainly fits with trying to make your tools resilient to "misuse"--or, better, properly supporting more patterns so they're not even misuse. No conclusions, just hand-waving, but I do wonder if anyone working in software construction tools will eventually more usefully wave their hands in that direction :P
It supports O(1) push_front/pop_front/push_back/pop_back/operator[]/at (like you would expect from a deque) but also O(sqrt(N)) insert and remove from the middle!!!
The paper was from 1999 and 2001 but I only learned about it from a recent HN post[3] where some guy rediscovered it (20 years too late). I still wonder why this design lost to the std::deque we have in STL today.
[1] http://www.cphstl.dk/Report/Deque-first-attempt/cphstl-repor...
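The core tiered-vector idea fits in a few lines. Here's a toy Python version with fixed-size blocks, my own simplification of the design in the paper (names are mine, and it omits the per-block circular buffers the real structure uses): indexing stays O(1) because every block except the last is exactly full, and an insert pays O(B) inside one block plus an O(1) pop/appendleft ripple through the O(N/B) blocks to its right, which is O(sqrt(N)) total when B ~ sqrt(N).

```python
from collections import deque

class TieredSeq:
    """Toy tiered sequence: fixed block size B; all blocks full except the last."""

    def __init__(self, items, block=4):
        self.B = block
        items = list(items)
        self.blocks = [deque(items[i:i + block])
                       for i in range(0, len(items), block)] or [deque()]

    def __len__(self):
        return (len(self.blocks) - 1) * self.B + len(self.blocks[-1])

    def __getitem__(self, i):
        # O(1): the invariant makes the block index pure arithmetic.
        return self.blocks[i // self.B][i % self.B]

    def insert(self, i, x):
        b, off = divmod(i, self.B)
        if b == len(self.blocks):      # inserting right past the end
            self.blocks.append(deque())
            off = 0
        self.blocks[b].insert(off, x)  # O(B) shuffle inside one block
        # Ripple the overflow rightward: one pop/appendleft per block.
        j = b
        while len(self.blocks[j]) > self.B:
            if j + 1 == len(self.blocks):
                self.blocks.append(deque())
            self.blocks[j + 1].appendleft(self.blocks[j].pop())
            j += 1

t = TieredSeq(range(10), block=4)
t.insert(2, 99)
print([t[i] for i in range(len(t))])  # → [0, 1, 99, 2, 3, 4, 5, 6, 7, 8, 9]
```

push_front and push_back are just insert(0, x) and insert(len(t), x); the front insert is the worst case here, since the ripple crosses every block.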
V8's Array does different things depending on whether the array is sparse or packed and what data types it contains.
The real problem is when you need min and max at the same time. Then you not only need a min heap and a max heap, you also need to track what was already popped from either heap! This is because Python's heapq can't delete an element that isn't at the root. Tracking "tombstones" lets you pop from one heap, then know to ignore that element the next time it surfaces in the other heap.
This is of course super complicated to maintain, and I get it wrong all the time. Luckily it only shows up in algo interview problems.
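A minimal sketch of that two-heap tombstone pattern with heapq, for reference (the class name and the per-heap deletion counters are my own way of writing it down):

```python
import heapq
from collections import Counter

class MinMaxBag:
    """Pop-min and pop-max over one multiset: two heaps plus tombstone counters."""

    def __init__(self):
        self.lo = []                 # min-heap
        self.hi = []                 # max-heap, stored negated
        self.dead_lo = Counter()     # values pop_max removed, not yet purged from lo
        self.dead_hi = Counter()     # values pop_min removed, not yet purged from hi

    def push(self, x):
        heapq.heappush(self.lo, x)
        heapq.heappush(self.hi, -x)

    def pop_min(self):
        # Purge roots that pop_max already claimed (raises IndexError if empty).
        while self.dead_lo[self.lo[0]]:
            self.dead_lo[self.lo[0]] -= 1
            heapq.heappop(self.lo)
        x = heapq.heappop(self.lo)
        self.dead_hi[x] += 1         # tombstone for this value's copy in hi
        return x

    def pop_max(self):
        while self.dead_hi[-self.hi[0]]:
            self.dead_hi[-self.hi[0]] -= 1
            heapq.heappop(self.hi)
        x = -heapq.heappop(self.hi)
        self.dead_lo[x] += 1         # tombstone for this value's copy in lo
        return x
```

The counters (rather than a set) matter for duplicates, and each side's tombstones must only be consumed by the *other* heap — exactly the kind of detail that's easy to get wrong under interview pressure.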
Taking 27 milliseconds just to start the interpreter.
Very useful when you need it!
https://en.wikipedia.org/wiki/D-ary_heap
(Also I didn't know they were called d-heaps, thanks!)