fwip 4y ago

LLVM implements fewer optimization passes for ARM, so it's doing less work than when compiling for x86.

yjftsjthsd-h 4y ago

Does that hurt performance of the output?

cute_boi 4y ago

There is a high probability that it can hurt runtime perf, because fewer passes means less optimization (in general).

omaranto 4y ago

Well, if it didn't, they could optimize the compiler by removing those optimizations. :)

KMag 4y ago

I think it's pretty clear the GP was asking whether the optimizations implemented for x86 that aren't implemented for aarch64 would actually improve performance of generated aarch64 code. It's a question about CPU architecture/microarchitecture. That's a different question from whether the optimizations improve performance of generated x86 code.

For instance, I imagine the x86_64 register allocator does some variant of graph coloring, with an additional pass to assign the lettered registers (rax, rbx, etc.) to the most heavily used values, since using the higher-numbered registers (r8 to r15) requires a REX prefix byte. In addition, many instructions have more compact encodings when eax/rax is the destination register. At a minimum, excess REX prefixes take up instruction cache space. There's no parallel for aarch64, where every instruction is 32 bits wide regardless of which registers it names, so there's no sense in implementing logic to try to make sure the low-numbered aarch64 registers are used more. (Though, on 32-bit ARM with Thumb/Thumb2, most 16-bit encodings can only address a subset of the registers, so there is a similar optimization for 32-bit ARM targets that support Thumb/Thumb2 when optimizing for space.)
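To make the encoding-size point concrete, here is a minimal Python sketch that hand-encodes the register-to-register form of x86-64 `add` (opcode `01 /r`). The function name `encode_add_reg_reg` and the numeric register convention (0 = rax/eax, 1 = rcx/ecx, 8 = r8, and so on) are my own choices for illustration; this is not how LLVM emits code, just a way to see where the extra byte comes from.

```python
def encode_add_reg_reg(dst: int, src: int, wide: bool = False) -> bytes:
    """Encode x86-64 `add dst, src` for two registers (opcode 01 /r).

    Registers are given by hardware number: 0..7 are the legacy registers
    (eax, ecx, edx, ebx, esp, ebp, esi, edi), 8..15 are r8..r15.
    `wide=True` selects the 64-bit operand size (rax instead of eax).
    Hypothetical helper for illustration only.
    """
    if not (0 <= dst <= 15 and 0 <= src <= 15):
        raise ValueError("register numbers must be in 0..15")

    rex = 0x40                # REX prefix base: 0100WRXB
    if wide:
        rex |= 0x08           # REX.W: 64-bit operand size
    if src >= 8:
        rex |= 0x04           # REX.R: extends the ModRM.reg field
    if dst >= 8:
        rex |= 0x01           # REX.B: extends the ModRM.rm field

    # mod = 11 (register direct), reg = source, rm = destination
    modrm = 0xC0 | ((src & 7) << 3) | (dst & 7)

    # For this instruction the prefix is only needed when a bit is set.
    prefix = bytes([rex]) if rex != 0x40 else b""
    return prefix + bytes([0x01, modrm])


if __name__ == "__main__":
    print(encode_add_reg_reg(0, 1).hex())             # add eax, ecx  -> 01c8
    print(encode_add_reg_reg(8, 9).hex())             # add r8d, r9d  -> 4501c8
    print(encode_add_reg_reg(0, 1, wide=True).hex())  # add rax, rcx  -> 4801c8
```

The 32-bit add on eax/ecx takes two bytes, while the same operation on r8d/r9d takes three. Note that 64-bit operand size needs a REX prefix anyway, so the saving from preferring low-numbered registers applies to 32-bit (and narrower) operations.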

I imagine there are better examples, but my point is that some optimizations are useless on some architectures.

makapuf 4y ago

That was (is?) the benchmark for gcc optimizations: an optimization has to pay for itself on the compiler. If the resulting compiler code is faster but the added compilation time is even greater, it's not worth it.
