https://github.com/llvm/llvm-project/blob/release/21.x/llvm/...
https://github.com/llvm/llvm-project/blob/release/21.x/llvm/...
1. organized data flow analysis
2. recognizing a pattern and replacing it with a faster version
The first is very effective over a wide range of programs and styles, and is the bulk of the actual transformations. The second is a never-ending accumulation of patterns, where one reaches diminishing returns fairly quickly.
The example in the linked article is very clever and fun, but not really of much value (I've never written a loop like that in 45 years). As mentioned elsewhere, "Everyone knows the Gauss Summation formula for sum of n integers i.e. n(n+1)/2", and since everyone knows it, why not just write that instead of the loop?
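For concreteness, here is roughly the kind of loop I mean, next to the formula. This is only a sketch: the function names are mine, not the article's, and I use a 64-bit accumulator so neither version overflows for any 32-bit n.

```c
#include <stdint.h>

/* Loop form: adds 1 + 2 + ... + n one term at a time.
   The counter is 64-bit so the loop still terminates when n == UINT32_MAX. */
uint64_t sum_loop(uint32_t n) {
    uint64_t total = 0;
    for (uint64_t i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* Closed form: n(n+1)/2, computed in 64 bits.
   For any 32-bit n the product n*(n+1) fits in a uint64_t. */
uint64_t sum_closed_form(uint32_t n) {
    uint64_t m = n;
    return m * (m + 1) / 2;
}
```

Both return the same value for every n. The second is what I would write by hand in the first place.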
Of course one could say that about any pattern, like replacing i*2 with i<<1, but those pattern replacements are very valuable because the patterns they match are routinely generated by high-level generic code.
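A small sketch of how i*2 turns up without anyone typing it. The helper names here are made up for illustration. Nobody writes the multiply by 2 directly; it appears once the generic helper is inlined with a constant element size, and whether a given compiler then emits a shift depends on the compiler and target.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Generic helper: byte offset of element i when each element
   occupies elem_size bytes. */
static inline size_t byte_offset(size_t i, size_t elem_size) {
    return i * elem_size;
}

/* Specialized use: sizeof v is the constant 2, so after inlining
   the multiply becomes i * 2, which an optimizer may emit as i << 1. */
uint16_t load_u16(const unsigned char *base, size_t i) {
    uint16_t v;
    memcpy(&v, base + byte_offset(i, sizeof v), sizeof v);
    return v;
}
```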
And you could say I'm just being grumpy about this because my optimizer does not do this particular optimization. Fair enough!
gcc was and is an incredible achievement, but it is traditionally considered difficult to implement many modern compiler techniques in it. It's at least unpleasant, let's put it this way.
https://github.com/gcc-mirror/gcc/blob/master/gcc/tree-chrec... https://github.com/llvm/llvm-project/blob/release/21.x/llvm/...
On the other hand, I find MSVC and especially ICC output to be quite decent, although I have never seen their source code.
Having inspected the output of compilers for several decades, it's rather easy to tell them apart.
If you mean graph coloring restricted to planar graphs, yes, it can always be done with at most 4 colors. But a given graph may need fewer, so the answer is not always the same.
(I know it was probably not a very serious comment but I just wanted to infodump about graph theory.)
A hard problem in optimization today is trying to fit code into the things complex SSE-type instructions can do. Someone recently posted an example where they'd coded a loop to count the number of one bits in a word, and the compiler generated a "popcount" instruction. That's impressive.
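I don't know which loop that post used, so take this as an assumption: one common way to write a bit-counting loop is the clear-the-lowest-set-bit trick below (the function name is mine). Whether a compiler replaces it with a single popcount instruction depends on the compiler, the optimization level, and whether the target is known to have the instruction.

```c
#include <stdint.h>

/* Counts the one bits in x. Each iteration clears the lowest set bit
   (x & (x - 1)), so the loop runs once per one bit. */
unsigned count_ones(uint64_t x) {
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}
```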
This kind of optimization, complete loop removal and computing the final value for simple math loops, is at least 10 years old.
In topics like compiler optimization, it's not like there are many books that describe this kind of algorithm.
Seems like the author is both surprised and delighted with an optimization they learned of today. Surely you’ve been in the same situation before.