EDIT: While looking up what I vaguely remembered, I happened to come across a similar article that was published just today[1]. The one I actually had in mind is older[2] and is about microstuttering (basically: a high standard deviation in frame rate). The point still stands - in fact it applies to both cases in somewhat different ways.
To give an example: Crossfire and SLI graphics card setups a few years ago[1]. It turned out that both gave a similar increase in average framerate, but it was then discovered that one of them had a significantly lower minimum framerate than the other. A high minimum framerate is probably more important in shooters than peak performance, but that's not what we've been testing all these years, is it? That's exactly the problem highlighted in the article by Zed.
I know this is a gaming example, but I'm sure the same thing matters just as much for how users perceive the responsiveness of webpages.
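To make the average-versus-minimum point concrete, here is a rough Python sketch. The frame times are made up for illustration (they are not from either Tom's Hardware article), and `frame_stats` is just a hypothetical helper name. It shows two runs with the same average framerate where only the minimum and the standard deviation reveal the stutter.

```python
import statistics


def frame_stats(frame_times_ms):
    """Summarise a list of per-frame render times given in milliseconds."""
    total_seconds = sum(frame_times_ms) / 1000.0
    # Instantaneous framerate implied by each individual frame.
    per_frame_fps = [1000.0 / t for t in frame_times_ms]
    return {
        # Frames rendered divided by total time: what most benchmarks report.
        "avg_fps": len(frame_times_ms) / total_seconds,
        # Framerate during the single slowest frame.
        "min_fps": min(per_frame_fps),
        # Spread of frame times; a high value is what microstutter looks like.
        "stdev_ms": statistics.pstdev(frame_times_ms),
    }


# Hypothetical data: both runs render 10 frames in 200 ms.
steady = [20.0] * 10          # every frame takes 20 ms
stuttery = [10.0, 30.0] * 5   # alternating fast and slow frames

for name, times in (("steady", steady), ("stuttery", stuttery)):
    print(name, frame_stats(times))
```

Both runs average 50 fps, so an average-only benchmark would call them equal, while the second one dips to roughly 33 fps on every other frame.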
[1] http://www.tomshardware.com/reviews/graphics-card-benchmarki...
[2] http://www.tomshardware.com/reviews/radeon-geforce-stutter-c...