You chose an embarrassingly parallel problem, which most programs are not. So you cannot generalize this example across most software. When you try to parallelize a structurally complicated algorithm, the biggest issue is contention. I was leaving this out because it really is a 2nd order problem -- most software today would get faster if you just cleaned up its memory usage, than if you just tried to parallelize it. (Of course it'd get even faster if you did both, but memory is the E1).
> There are many possible answers to this.
How come so few people are concerned with the answers to that question and which are true, but so many people are concerned with making performance claims?