The extensive compile-time metaprogramming facilities in C++ give it unique performance advantages relative to other high-performance languages, and are the reason it tends to be faster in practice.
For example, compare the speed and implementation of std::sort and qsort (the difference in run time can be almost an order of magnitude for big N!)
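To make the comparison concrete, here's a rough sketch of the two call sites on the same data. The names `cmp_int`, `sort_with_qsort` and `sort_with_std_sort` are made up for illustration, and this shows only the shape of the code, not a benchmark.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// qsort only sees elements as void*, so the comparator has to cast and
// dereference, and qsort reaches it through a function pointer.
extern "C" int cmp_int(const void* a, const void* b) {
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);  // avoids the overflow that x - y could hit
}

// Hypothetical helper: sorts a copy with the C library routine.
std::vector<int> sort_with_qsort(std::vector<int> v) {
    std::qsort(v.data(), v.size(), sizeof(int), cmp_int);
    return v;
}

// Hypothetical helper: sorts a copy with std::sort, which knows the element
// type and the comparison (operator<) at compile time.
std::vector<int> sort_with_std_sort(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return v;
}
```

Both produce the same ordering; whatever speed gap you measure will depend on the compiler, the standard library and the input.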
Also, sorting is an area where algorithmic improvements make a sizeable difference, so you need to be sure either that you're measuring apples vs apples or that you've decided up front what your criteria are (e.g. "lazy people will use the stdlib, so I only test that", or "nobody sorts non-integer types, so I only test integers").
For some inputs, if you're willing to use a specialist sort, the best option today is C. If you care enough to spend resources on specialising the sort for your purpose, that's a real option. Or alternatively, if you can't be bothered to do more than reach for the standard library, of course Rust has a significantly faster sort (stable and unstable) than any of the three C++ stdlibs. Or maybe you want a specialized vector sort that Intel came up with, and they wrote it for C++. Hope portability wasn't an issue, 'cos unsurprisingly Intel only care if it works on Intel CPUs.
Sure, if you write all the code. If you're writing a library or more generic functions, you don't have that power.
And, even then, while you can do this it's going to be much more code and more prone to bugs. C++ is complex, but that complexity can often bring simplicity. I don't need to specialize for int, double, float, etc because the compiler can do it for me. And I know the implementation will be correct. If I specialize by hand, I can make mistakes.
In addition, this isn't something where C "shines". You can do the exact same thing in C++, if you want. Many templates have hand-rolled specializations for some types.
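As a small sketch of both points, here's a generic template next to a hand-rolled specialization. `is_less` is a made-up name for illustration, not something from any particular library.

```cpp
#include <cstring>

// Generic version: the compiler instantiates one copy for each T that is
// actually used (int, double, float, ...), with no hand-written duplicates.
template <typename T>
bool is_less(const T& a, const T& b) {
    return a < b;
}

// Hand-rolled specialization: for C strings, comparing the pointers would be
// wrong, so compare the characters instead.
template <>
bool is_less<const char*>(const char* const& a, const char* const& b) {
    return std::strcmp(a, b) < 0;
}
```

Calling `is_less(1, 2)` or `is_less(1.5, 2.5)` uses the generic body, while two `const char*` variables pick up the specialization.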
> apples vs apples
It is, they're both quicksort variants in typical implementations. When every single comparison requires multiple dereferences + an indirect function call, it adds up.
> For some inputs if you're willing to use a specialist sort the best option today is C
I don't understand how. Even if this is the case, which I doubt, you could just include the C headers in a C++ application. So C++ is just as good a choice, and you get whatever else you want/need.
> Rust has significantly faster sort (stable and unstable) than any of the three C++ stdlibs
Maybe, but there's a new std::sort implementation in LLVM 17. Regardless, the Rust implementations are very fast for the same reason the C++ implementations are fast - encoding information in types at compile-time and aggressively inlining the comparison function. Rust has a very similar generic methodology to C++.
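Roughly what "encoding information in types" means in C++ terms, as a sketch: the comparator's type is a template parameter, so a lambda gives the compiler one specific function to inline, while a function pointer type only says "some function with this signature". `insertion_sort`, `insertion_sort_ptr` and `descending` are illustrative names, and insertion sort is used only because it's short.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// The comparator is a template parameter. Every lambda has its own unique
// type, so comp(a, b) names one specific function and is easy to inline.
template <typename T, typename Compare>
void insertion_sort(std::vector<T>& v, Compare comp) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        for (std::size_t j = i; j > 0 && comp(v[j], v[j - 1]); --j) {
            std::swap(v[j], v[j - 1]);
        }
    }
}

// With a function pointer the type is the same for every comparator, so the
// target is only known at run time unless the optimizer sees through the call.
using IntCompare = bool (*)(int, int);

inline bool descending(int a, int b) { return a > b; }

inline void insertion_sort_ptr(std::vector<int>& v, IntCompare comp) {
    insertion_sort<int, IntCompare>(v, comp);
}
```

Whether the pointer version ends up slower depends on what the optimizer can prove at the call site.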
Yes, but not if you pass in void *. For libraries this matters. If you're both writing the producer and consumer then sure, you can do it manually.
> code bloat caused by monomorphization
This is true and a real problem, but I would argue that in most scenarios the extra codegen will be more performant than dynamic allocation + indirection. Because that's the alternative, like how Swift or C# or Java do it.
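A C++-only sketch of that trade-off, with made-up types (`Square`, `Rect`, `AnyShape`, `BoxedSquare`): the template version is compiled once per type and stores elements inline, while the type-erased version has a single body but allocates each element and calls through a vtable. It's only an analogy for what other languages do, not a description of their implementations.

```cpp
#include <memory>
#include <vector>

struct Square { double side; double area() const { return side * side; } };
struct Rect { double w, h; double area() const { return w * h; } };

// Monomorphized: one copy of total_area per Shape type (more code), but
// area() is a direct call and the elements live inline in the vector.
template <typename Shape>
double total_area(const std::vector<Shape>& shapes) {
    double sum = 0.0;
    for (const Shape& s : shapes) sum += s.area();
    return sum;
}

// Type-erased: one copy of the loop, but every element is a separate heap
// allocation and every area() call goes through a virtual dispatch.
struct AnyShape {
    virtual ~AnyShape() = default;
    virtual double area() const = 0;
};

struct BoxedSquare final : AnyShape {
    double side;
    explicit BoxedSquare(double s) : side(s) {}
    double area() const override { return side * side; }
};

double total_area_dyn(const std::vector<std::unique_ptr<AnyShape>>& shapes) {
    double sum = 0.0;
    for (const auto& s : shapes) sum += s->area();
    return sum;
}
```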