My favourite example: multiplication of very large square matrices. The algorithms with the best time complexity are never used in the real world. Under no realistic circumstances would they offer the best performance.
In that case, it's not cache behaviour or branch prediction that dooms the algorithm, but enormously expensive steps whose cost ends up in the constant factor. Big-O notation discards constants, so these steps don't affect the algorithm's complexity-theoretic properties, yet they push the crossover point — the input size at which the algorithm would start to win — beyond any matrix you could ever feasibly multiply.
Do we describe such algorithms as the 'fastest'? I wouldn't. They have the lowest time complexity. Not the same thing.
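You can see the trade-off in miniature even with Strassen's algorithm — which, unlike the truly galactic ones, is sometimes practical — in a rough pure-Python sketch (function names and the power-of-two restriction are my own simplifications). Strassen does O(n^2.81) multiplications instead of O(n^3), but pays for it with extra additions, recursion, and copying; at small sizes the naive triple loop wins despite its worse complexity.

```python
def naive_mul(A, B):
    """Textbook O(n^3) multiplication of square matrices (lists of lists)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def _add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def _sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def strassen(A, B):
    """Strassen's O(n^2.81) multiplication; n must be a power of two.
    The quadrant splits, seven recursive products, and recombination
    are exactly the constant-factor overhead the naive loop avoids."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    def quad(M):  # split into four h x h quadrants
        return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
                [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])
    a, b, c, d = quad(A)
    e, f, g, i = quad(B)
    # Seven recursive products instead of eight.
    p1 = strassen(a, _sub(f, i))
    p2 = strassen(_add(a, b), i)
    p3 = strassen(_add(c, d), e)
    p4 = strassen(d, _sub(g, e))
    p5 = strassen(_add(a, d), _add(e, i))
    p6 = strassen(_sub(b, d), _add(g, i))
    p7 = strassen(_sub(a, c), _add(e, f))
    top_left  = _add(_sub(_add(p5, p4), p2), p6)
    top_right = _add(p1, p2)
    bot_left  = _add(p3, p4)
    bot_right = _sub(_sub(_add(p5, p1), p3), p7)
    return ([tl + tr for tl, tr in zip(top_left, top_right)] +
            [bl + br for bl, br in zip(bot_left, bot_right)])
```

Time both on, say, 64x64 matrices and the naive loop comes out ahead; the asymptotically better algorithm only pulls away once n is large enough for the exponent to beat the bookkeeping. For the galactic algorithms, that crossover never arrives at any realistic size.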