0 points · jcranmer · 10y ago · 0 comments

Actually, Duff's device is very likely to be a performance hit most of the time. It tends to behave badly with the branch predictor, and it creates unstructured control flow, which makes the code harder for the compiler to reason about and optimize.

It should be noted, though, that the person who originally came up with Duff's device admitted that he had tried everything else to optimize the code and only that worked, and he never advocated it as a general optimization technique.
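
For anyone who hasn't seen the device, here is a minimal sketch in C of what is being compared. It assumes the common memory-to-memory copy variant (both pointers advance), and the function names `copy_plain` and `copy_duff` are made up for illustration. The `switch` jumps into the middle of the unrolled `do`/`while` body, which is the unstructured control flow referred to above.

```c
#include <stddef.h>

/* Plain loop: structured, so a compiler is free to unroll or vectorize it. */
void copy_plain(short *to, const short *from, size_t count)
{
    for (size_t i = 0; i < count; i++)
        to[i] = from[i];
}

/* Duff's device, copy variant (illustrative; names are hypothetical).
   The switch enters the 8x unrolled loop part-way through so that the
   first pass handles the count % 8 leftover elements. */
void copy_duff(short *to, const short *from, size_t count)
{
    if (count == 0)
        return;                  /* the device itself assumes count > 0 */

    size_t n = (count + 7) / 8;  /* total passes through the do/while */

    switch (count % 8) {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

Both functions copy the same data; which one is faster depends on the compiler and target, so it is worth measuring rather than assuming.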

4 comments · 2 top-level

sago · 10y ago · 2 in thread

That's been my experience too, not just of Duff's device but in general with trying to unroll loops. There may have been a point where it was practical (although early in my career I didn't do the profile-first-optimise-later thing very well, I confess). But if you think you can manually unroll a loop in a way that beats the branch predictor and a good compiler on most modern hardware, you're probably fooling yourself.

nwmcsween · 10y ago

Eh, this is still relevant; loop unrolling can still have a drastic effect if the compiler cannot optimize the loop itself.
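
One case where this can come up, sketched in C under some assumptions: a floating-point sum. Under strict IEEE semantics (no `-ffast-math` or similar) a compiler generally may not reorder the additions, so the simple loop stays one long dependency chain. Unrolling by hand with separate accumulators breaks that chain. The function names and the unroll factor of 4 are arbitrary choices for illustration.

```c
#include <stddef.h>

/* Straightforward reduction: every add depends on the previous one. */
float sum_simple(const float *x, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Manually unrolled by 4 with independent accumulators (illustrative).
   The four partial sums do not depend on each other, so the CPU can
   overlap the additions. */
float sum_unrolled4(const float *x, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }

    float s = (s0 + s1) + (s2 + s3);

    /* Handle the 0 to 3 elements left over after the unrolled part. */
    for (; i < n; i++)
        s += x[i];

    return s;
}
```

Because the additions happen in a different order, the unrolled version can round differently from the simple one on general inputs. Whether it is actually faster depends on the compiler flags and the hardware, so profile both.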

sago · 10y ago

I suspect I'm just not in the group of people for whom it is relevant any more. Way back I was optimising for console hardware when that hardware was quite rudimentary. Now that the hardware is much more sophisticated, I don't need it.

Can you say what kind of stuff you unroll loops for, and in what kinds of situations? Just out of curiosity.

adrusi · 10y ago

It's worth noting that there are definitely exceptions where loop unrolling is a good practice for optimization. On obscure old architectures, and even some common ones, the CPU doesn't have things like branch prediction and caches, and your compiler is likely quite rudimentary.

As always, profiling is key.
