> pdqsort is a hybrid sort; when it encounters apparently pathological input, it switches from a strategy resembling quicksort to heapsort—it doesn't share quicksort's worst case characteristics.
I understand how hybrid sorts work. I thought that would be clear enough. I guess when you're a stranger on the internet, people don't automatically trust you to know what you're talking about. I imagine this is even truer when you say something counterintuitive or controversial. In spite of learning that, I'm responding chiefly because of that quote.
> Your thesis is wrong:
>
> Heapsort has a variable runtime
Let's not get hung up on heapsort. My thesis is if you have an algorithm with a gap between the worst case and average case, then you shouldn't use it on the web. That gap might not manifest as gap in the big O -- it could be 4x slower in the worst case than the average case, and that would still be bad.
To see why, try imagining this world of pure evil:
You create a website that uses gapsort. Gapsort is an algorithm that is 4x slower in the worst case than it is in the average case, for a fixed value of n (the size of the input). On top of that, triggering the worst case by accident is hard. You deploy your site for several months, buy servers, and get a userbase. In your performance testing, you only encountered the average case, so you only provisioned hardware for the average case. Now imagine that suddenly all your users turn evil, and start triggering the worst case; this leads to a sudden 4x performance drop. You may have underprovisioned for this.
So my thesis is it looks like having differences from the worst case and average case is great, on average. But in a hostile world, that actually makes things more complicated. When doing empirical performance testing, you'll test for the average. Moreover the gap can be just 4x; this will not manifest as a difference between the big Os of the worst and average cases (as I've said previously).
Changing from quicksort to heapsort on some inputs may manifest as a performance gap between the QS case and fallback case. Maybe that's not such a good idea. In fact, maybe you shouldn't use any quicksort variant, even introsort, in a potentially hostile environment, because of the unavoidable average-to-worst-case gap. In introsort, that gap is merely a different coefficient, but it's still a gap.
I hope that wasn't too repetitive.
[edit] I deleted and reposted this because the edits weren't being applied.
[edit] Done it twice now.