If you use a custom quicksort implementation and put it in the same translation unit as the comparison function (or link statically and use link-time optimization with a sufficiently advanced compiler), the compiler can inline the comparison, and you can get the same performance out of C.
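As a rough sketch of that layout, assuming a plain Lomuto-partition quicksort and `int` elements: the names `quicksort`, `compare_ints`, `swap_bytes`, and `sort_ints` are hypothetical, chosen only for illustration. Because both the sort and the comparison function are `static` and live in one file, an optimizing compiler is free to specialize `quicksort` for `compare_ints` and inline the call. Whether it actually does so depends on the compiler and the optimization level.

```c
#include <stddef.h>

/* Hypothetical comparison function. It is static, so the compiler knows
 * every call site is inside this translation unit. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the overflow of x - y */
}

/* Swap two elements of `size` bytes each. */
static void swap_bytes(unsigned char *a, unsigned char *b, size_t size)
{
    while (size--) {
        unsigned char t = *a;
        *a++ = *b;
        *b++ = t;
    }
}

/* Minimal quicksort with a qsort-like signature.
 * Lomuto partition, last element as pivot; not tuned for real workloads. */
static void quicksort(void *base, size_t n, size_t size,
                      int (*cmp)(const void *, const void *))
{
    unsigned char *arr = base;
    unsigned char *pivot;
    size_t store = 0;
    size_t i;

    if (n < 2)
        return;

    pivot = arr + (n - 1) * size;
    for (i = 0; i + 1 < n; i++) {
        if (cmp(arr + i * size, pivot) < 0) {
            swap_bytes(arr + i * size, arr + store * size, size);
            store++;
        }
    }
    /* Move the pivot to its final position. */
    swap_bytes(arr + store * size, pivot, size);

    quicksort(arr, store, size, cmp);
    quicksort(arr + (store + 1) * size, n - store - 1, size, cmp);
}

/* Public entry point: the comparison function is a compile-time constant
 * here, which is what gives the optimizer the chance to inline it. */
void sort_ints(int *values, size_t n)
{
    quicksort(values, n, sizeof *values, compare_ints);
}
```

By contrast, the standard `qsort` is normally compiled separately inside the C library, so the call through the function pointer usually cannot be inlined unless the library itself takes part in link-time optimization.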