Link-time optimization (LTO), as the name implies, means that you do another optimization pass at link time, when the optimizer can see all of the program's modules together. This enables lots of optimizations that don't work when you look at one module at a time.
Functions can be inlined across module boundaries, even when they're not declared inline. You can turn virtual calls into direct calls (devirtualization) if you know that the virtual function is never overridden, or if you can derive the exact type. You can change calling conventions for functions that are never called from outside the program. You can do better escape and aliasing analysis. If a function is only called once, then you can probably optimize it a lot better because you know exactly how it will be called.
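To make the cross-module inlining point concrete, here is a minimal sketch. It is written as a single file so it compiles on its own, with comments marking where the two translation units would be split; the file names `math_utils.cpp` and `app.cpp` and the functions `scale` and `sum_scaled` are made up for illustration.

```cpp
#include <cstddef>

// ---- would live in math_utils.cpp (hypothetical file name) ----
// Not declared inline, and defined in a different file from its caller.
// When the caller is compiled on its own, the optimizer sees only the
// declaration, so it has to emit a real call.
int scale(int x, int factor) {
    return x * factor;
}

// ---- would live in app.cpp (hypothetical file name) ----
int scale(int x, int factor);  // declaration only: all app.cpp would see

int sum_scaled(const int* values, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // This is the only call site, and factor is always 3. With the
        // body of scale visible at link time, the optimizer is free to
        // inline it here and fold the constant into the loop.
        total += scale(values[i], 3);
    }
    return total;
}
```

With GCC or Clang, you would typically enable this by passing `-flto` both when compiling and when linking, for example `g++ -O2 -flto math_utils.cpp app.cpp`. Whether the call actually gets inlined is up to the optimizer's heuristics, so treat this as an illustration of what becomes possible, not a guarantee.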
As with any optimization, not every program will see a significant benefit. Programs dominated by heavy inner loops, like physics simulators and graphics processing code, tend not to see much benefit, since the local optimizer already works well enough on them. Programs like compilers, interpreters, and web browsers tend to see larger benefits. When LTO does help, the gains can be large: 30% improvements in running time are not unheard of.
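In short, LTO is a high-cost, high-benefit optimization: you pay for it with longer link times and more memory used during the build, and in return the running time can improve substantially.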