https://probablydance.com/2020/08/31/on-modern-hardware-the-...
heaps of comments in the 2020 HN thread: https://news.ycombinator.com/item?id=24353423
I would perhaps attempt to gather more non functional requirements. How many items on the queue? How quickly are they added? How many threads are consuming from the queue? What’s the size of the message?
Also, how much would all this cost to run, in terms of memory? Is there a way of persisting the in-memory structure in case the server goes down?
> The author clearly understands the pattern and the technologies they’re working with.
We must have read a different article.
The one I read put a big lock around the outside of their interface and called that "concurrent". This is not what is normally meant by the term.
This was after a dozen or so pages that described an insertion sort poorly, and followed by some speculation about why their software still didn't work right.
one downside of linked lists is that each node in the list can reside in an arbitrary part of memory. in the worst case each traversal of a link is a cache miss, so you sit around for 100 or 200 cycles waiting for a main memory read. similarly if all the values in the list that are used to implement comparisons are also stored by reference and scattered throughout memory.
with heaps, another interesting thing to try is switching from a binary heap to e.g. a 4-ary heap (i.e. 4 children per parent). i've been fiddling with heaps recently for a priority queue & see a decent speedup switching to a 4-ary heap -- provided the "heapify down" logic that runs whenever you pop an element is structured to let the cpu do as many pairwise compares as possible in parallel, rather than structuring it as a big chain of dependent computations. during "heapify" operations you're jumping around inside a heap a fair bit, which may also cause cache misses, but each bunch of children will be arranged linearly in memory, so there is a bit more signal to noise of each cache line read.
Second, it has a `O(N)` performance on inserts.
This is beyond unexpected for something that is supposed to be a queue.