
0 points · acdha · 8y ago · 0 comments

One of the other benefits is keeping those notes around in case some assumptions change in the future and the optimization you picked is no longer viable.


2 comments · 1 top-level

glangdale · 8y ago · 1 in thread

Absolutely, yes. Or the optimization landscape suddenly changes. I had a super-cool trick for doing state transitions in a DFA at the rate of a shuffle instruction's throughput rather than its latency, and smugly congratulated myself about how well it worked on Ivy Bridge (latency = 1, reciprocal throughput = 0.5). Then Haswell came along and took over the second shuffle capability to do 256-bit shuffles (latency = 1, reciprocal throughput = 1). So the clever trick went obsolete overnight. :-(
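The comment doesn't say how the trick actually worked, so the following is only a guess at the general idea, not glangdale's method. One common way to make DFA stepping throughput-bound is to treat each input character's transition row as a shuffle mask and compose those masks pairwise: composition is associative, so the shuffles within one level of the tree don't depend on each other. The 4-state DFA, `TABLE`, and all function names below are hypothetical, and plain Python tuples stand in for SIMD registers.

```python
# Hypothetical 4-state DFA over the alphabet {"a", "b"}.
# TABLE[c][s] is the next state after reading c in state s, so each row
# plays the role of a shuffle mask (16 bytes in a real SSE register).
TABLE = {
    "a": (1, 2, 3, 3),
    "b": (0, 0, 0, 3),
}


def shuffle(mask, vec):
    """Model of a byte shuffle: out[i] = vec[mask[i]]."""
    return tuple(vec[m] for m in mask)


def run_serial(text, start=0):
    """Latency-bound: every lookup waits for the previous state."""
    state = start
    for c in text:
        state = TABLE[c][state]
    return state


def run_composed(text, start=0):
    """Throughput-bound: compose transition vectors pairwise in a tree.

    The shuffles inside one pass of the while loop are independent of
    each other, which is what would let hardware overlap them.
    """
    vecs = [TABLE[c] for c in text]
    if not vecs:
        return start
    while len(vecs) > 1:
        nxt = []
        for i in range(0, len(vecs) - 1, 2):
            # Result maps s -> vecs[i+1][vecs[i][s]]: step i, then step i+1.
            nxt.append(shuffle(vecs[i], vecs[i + 1]))
        if len(vecs) % 2:
            nxt.append(vecs[-1])  # odd element carries over unchanged
        vecs = nxt
    return vecs[0][start]
```

Both functions return the same final state for any input; the composed version does more shuffles in total but shortens the dependency chain, which only pays off while the CPU can issue more than one shuffle per cycle.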

acdha (OP) · 8y ago

I remember a few stories like that back in the P3/P4/AMD era where a researcher ended up ripping out his hand-tuned assembly because the C reference implementation had steadily become the faster of the two. It was really good that they had been conscientious about testing both implementations regularly, so there was zero concern about subtle incompatibility.
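For anyone who hasn't set this up before, a minimal sketch of that kind of side-by-side check might look like the following. The popcount pair is just a stand-in I picked for illustration (it is not from the stories above), and the function names are hypothetical.

```python
import random


def popcount_reference(x):
    """Obviously-correct reference: count '1' digits in the binary form."""
    return bin(x).count("1")


def popcount_optimized(x):
    """Bit-parallel (SWAR) popcount for 32-bit unsigned values."""
    x = x - ((x >> 1) & 0x55555555)                 # 2-bit partial sums
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)  # 4-bit partial sums
    x = (x + (x >> 4)) & 0x0F0F0F0F                 # 8-bit partial sums
    # Multiply sums all four bytes into the top byte; mask to 32 bits
    # because Python integers do not wrap.
    return ((x * 0x01010101) & 0xFFFFFFFF) >> 24


def check_equivalent(trials=10_000, seed=0):
    """Compare both implementations on edge cases plus random inputs."""
    rng = random.Random(seed)  # fixed seed so failures are reproducible
    edge_cases = [0, 1, 0x80000000, 0x55555555, 0xFFFFFFFF]
    samples = edge_cases + [rng.getrandbits(32) for _ in range(trials)]
    for x in samples:
        assert popcount_optimized(x) == popcount_reference(x), hex(x)
    return True
```

Running `check_equivalent()` in CI alongside a benchmark of both versions is what makes it safe to delete whichever one loses.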
