> [...] variety of front ends: languages with compilers that use LLVM include ActionScript, Ada, C#, Common Lisp, Crystal, CUDA, D, Delphi, Dylan, Fortran, Graphical G Programming Language, Halide, Haskell, Java bytecode, Julia, Kotlin, Lua, Objective-C, OpenGL Shading Language, Ruby, Rust, Scala, Swift, Xojo, and Zig.
Sometimes I think mention of Rust in the title just gets upvotes without even reading the article, here.
In poorly performing languages like Python and Ruby, calling a method on a value (like sum(), or even the += operator) essentially means looking up the method name in a hash table and then making a dynamic call through the resulting function pointer. So this sort of optimization cannot be done in an ahead-of-time compiler unless all objects are created locally and thus have known method dictionaries (and even then, it depends on whether those methods are available to the compiler in IR form, whether the lookup code is inlined, and whether the resulting code can actually be simplified).
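To make that cost concrete, here is a rough Rust sketch of what such a dynamic call amounts to. The HashMap-based method table and the "sum" method name are my own illustration of the pattern, not how CPython or MRI are actually implemented:

```rust
use std::collections::HashMap;

// A toy "object" whose methods live in a per-object dictionary,
// roughly mimicking dynamic method dispatch.
struct DynObject {
    value: i64,
    methods: HashMap<&'static str, fn(&DynObject) -> i64>,
}

fn sum_to(obj: &DynObject) -> i64 {
    (1..obj.value).sum()
}

fn main() {
    let mut methods: HashMap<&'static str, fn(&DynObject) -> i64> = HashMap::new();
    methods.insert("sum", sum_to);
    let obj = DynObject { value: 10, methods };

    // Every call pays for a hash lookup plus an indirect call; an
    // AOT compiler cannot fold this away without proving what the
    // table contains at the call site.
    let method = obj.methods["sum"];
    println!("{}", method(&obj));
}
```

The point is that until the lookup is resolved, the compiler sees only an opaque function pointer, so it has nothing to inline or simplify.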
Well, you do need something very much like this if you want to support "openly extensible" classes in a backward- and forward-compatible way. Otherwise you end up with something like the C++ ABI hell, where changing things around in a base class breaks every binary that links to earlier versions of that class. This is only a "poorly-performing" pattern insofar as the developer has no way to "opt out" of that extensibility when that's the sensible choice.
TruffleRuby has been constant folding trivial code like this since 2014. AOT is also an option.
https://rust-lang.github.io/rustc-guide/miri.html
So while, yes, in general pretty much all Rust optimizations are actually due to LLVM, this one in particular might not be.
Byte for byte identical assembly.
.NET Core and .NET Framework do not.
What I’m interested in is how the IL code from the compiler front ends for each of those languages is made well-optimizable by LLVM (in practical situations, not just a few lines of simple numerical code). I’ve heard that you need to be careful to encode the IL in the right way so that LLVM does not generate needless memcpys or something, but I’m not that much of a compiler expert...
The zero-cost abstraction is the fact that you get the same output, special optimization and all, whether you write the original loop, the "(1..n).fold(0, |x,y| x + y)" variant, or the "(1..n).sum()" variant. Rust is converting those to intermediate code that is the same, or close enough, that LLVM is able to apply the same optimization.
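A minimal check of that claim; the three function names are mine, and this only verifies that the variants agree on results, not that the generated assembly is identical:

```rust
// Three ways to sum the range 1..n; the claim is that rustc lowers
// all of them to equivalent IR, which LLVM then optimizes identically.
pub fn sum_loop(n: i32) -> i32 {
    let mut total = 0;
    for i in 1..n {
        total += i;
    }
    total
}

pub fn sum_fold(n: i32) -> i32 {
    (1..n).fold(0, |x, y| x + y)
}

pub fn sum_iter(n: i32) -> i32 {
    (1..n).sum()
}

fn main() {
    for n in [0, 1, 10, 1000] {
        assert_eq!(sum_loop(n), sum_fold(n));
        assert_eq!(sum_fold(n), sum_iter(n));
    }
    println!("all three variants agree");
}
```

Pasting these into Compiler Explorer with optimizations on is the easy way to compare the actual codegen.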
Would it be better if some sample code had been chosen that wasn't special-cased to the degree that the entire algorithm was replaced wholesale? Probably. It doesn't invalidate the premise, though; it just slightly obscures it.
What you don’t use, you don’t pay for. And further: What you do use, you couldn’t hand code any better.
Rust iterators fit in the second part.
1: https://boats.gitlab.io/blog/post/zero-cost-abstractions/
Compilers are awesome, check out Matt Godbolt's talk on this very topic: https://www.youtube.com/watch?v=nAbCKa0FzjQ
I can't figure out why it doesn't use the simpler formula (other than the optimizer being bad).
Seeing what

    pub fn sum3(n: i32) -> i32 {
        (1..n).sum::<i32>() + (1..2*n).sum::<i32>() + (1..(n + 2)).sum::<i32>()
    }

does would be more interesting to me. Also, while all the inlining is rustc's doing, I assume the "triangle number trick" is LLVM's.
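The "triangle number trick" referred to here is Gauss's closed form: for the exclusive range 1..n, the sum is n*(n-1)/2. A quick sanity check of the identity itself (this snippet is mine; it only verifies the arithmetic, not what shape of code LLVM actually emits):

```rust
// Gauss's closed form for the sum of the exclusive range 1..n.
fn triangle(n: i64) -> i64 {
    n * (n - 1) / 2
}

fn main() {
    // Compare against the straightforward iterator sum, including
    // the edge cases n = 0 and n = 1 where the range is empty.
    for n in 0..100 {
        assert_eq!((1..n).sum::<i64>(), triangle(n));
    }
    println!("closed form matches iterator sum for n in 0..100");
}
```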
"The Implementation of Functional Programming Languages"
"Compiling with Continuations"
"Modern Compiler Implementation in ..." (C, Java and ML variants)
Although oriented towards Lisp, "Lisp in Small Pieces" is a classic as well, with many optimizations such as inlining lambda calls across multiple call levels.
Just generally curious.
Why disappointed? It just highlights the quality of Rust's approach.
This happens all too often: once the code changes enough that the optimizer doesn't recognize the pattern anymore, it throws up its hands, and you're on your own. Some people call this optimizer roulette.
It's not just compilers, either. CPUs have their own peephole optimizers and patterns they recognize (or don't), and that can easily make a 2x difference in your run time depending on whether the CPU cottons to what you're trying.
The days of Z80, 6502 and similar are long gone.