Compilers for embedded processors aren't bad at all. But architectures vary widely, and the article is about a system that tunes optimizations automatically instead of having engineers tune them by hand.
Also note that low-level performance typically matters more in embedded systems than in non-embedded ones. You usually don't care whether a function in your web or desktop application ends up inlined. In an embedded system, that same decision can make a significant difference.