A nitpick, but my name is Orson Peters; Peters is my last name :)
> I assume that's one of the advantages of building upon powersort? The blocky nature of quadsort is both a blessing and a curse in that regard.
I don't know. I only use a binary search when splitting up merges, and almost no time is spent in that routine. I use the splitting for the low-memory case, and also to create more independent work that instruction-level parallelism can exploit.
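
To illustrate the idea of splitting a merge with a binary search, here is a minimal sketch. It is not the actual implementation: the names `split_merge`, `merge_into`, and `merge_via_split` are made up for this example, and picking the middle of the left run as the pivot is an assumption. The point is only that one binary search turns a single merge into two merges that do not depend on each other.

```rust
/// Split two sorted runs into two independent merge jobs.
/// Every element of the first job sorts before (or ties with) every element
/// of the second job, so the two results can simply be concatenated.
/// Hypothetical helper for illustration only.
fn split_merge<'a, T: Ord>(
    left: &'a [T],
    right: &'a [T],
) -> ((&'a [T], &'a [T]), (&'a [T], &'a [T])) {
    // Assumption: use the middle element of the left run as the pivot.
    let mid = left.len() / 2;
    let cut = match left.get(mid) {
        // Binary search: index of the first element in `right` that is not
        // strictly less than the pivot. Strictness keeps the merge stable.
        Some(pivot) => right.partition_point(|x| x < pivot),
        // Empty left run: everything in `right` goes into the first job.
        None => right.len(),
    };
    let (l0, l1) = left.split_at(mid);
    let (r0, r1) = right.split_at(cut);
    ((l0, r0), (l1, r1))
}

/// Plain stable two-way merge that appends to `out`.
fn merge_into<T: Ord + Clone>(left: &[T], right: &[T], out: &mut Vec<T>) {
    let (mut i, mut j) = (0, 0);
    while i < left.len() && j < right.len() {
        // Take from `right` only when it is strictly smaller (stability).
        if right[j] < left[i] {
            out.push(right[j].clone());
            j += 1;
        } else {
            out.push(left[i].clone());
            i += 1;
        }
    }
    out.extend_from_slice(&left[i..]);
    out.extend_from_slice(&right[j..]);
}

/// Merge by first splitting into two independent jobs.
/// Here the jobs run one after the other; they could also be interleaved
/// or handled with a smaller buffer, since neither reads the other's data.
fn merge_via_split<T: Ord + Clone>(left: &[T], right: &[T]) -> Vec<T> {
    let ((l0, r0), (l1, r1)) = split_merge(left, right);
    let mut out = Vec::with_capacity(left.len() + right.len());
    merge_into(l0, r0, &mut out);
    merge_into(l1, r1, &mut out);
    out
}
```

For example, splitting `[1, 3, 5, 7, 9]` and `[2, 4, 6, 8]` this way gives the jobs `([1, 3], [2, 4])` and `([5, 7, 9], [6, 8])`. The binary search itself is a single `partition_point` call, which is why so little time goes into it.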
> As for instruction-level parallelism, I do think it's actually almost entirely memory-level parallelism. I could be wrong though.
I didn't do specific research into which effect it is. When I say ILP, I'm also including the memory-level effects.