Well, it's not really a question of whether the operations are commutative - both of those phrasings of the problem will give the correct answer in hardware 100% of the time. The only difference is that one of them makes an efficient choice about the order of operations, using knowledge of how the bit growth rules work and of the ranges of the inputs.
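To make the bit growth point concrete, here's a rough sketch. It is not the code from the discussion above; the operand widths (16, 4 and 4 bits) and the helper names `add_width` and `chain_widths` are made up for illustration, and it assumes the usual full-precision rule for unsigned addition, where the result needs one more bit than the wider operand.

```python
def add_width(x_bits, y_bits):
    """Full-precision unsigned add: result needs max(x, y) + 1 bits (assumed rule)."""
    return max(x_bits, y_bits) + 1


def chain_widths(widths):
    """Add operands left to right; return (final width, width of each adder output)."""
    acc = widths[0]
    adders = []
    for w in widths[1:]:
        acc = add_width(acc, w)
        adders.append(acc)
    return acc, adders


# Hypothetical operands: a is 16 bits wide, b and c are 4 bits wide.
wide_first = chain_widths([16, 4, 4])    # (a + b) + c
narrow_first = chain_widths([4, 4, 16])  # (b + c) + a

print(wide_first)    # (18, [17, 18])
print(narrow_first)  # (17, [5, 17])
```

Both orderings compute the same sum, but adding the two narrow operands first gives a 5-bit adder followed by a 17-bit one, instead of a 17-bit adder followed by an 18-bit one. That's the kind of reordering you'd hope a synthesis tool could find for itself once it knows the input ranges.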
You do have the same problem in hardware, but there it's the hardware designer's job. The difference is that RTLs don't claim to be high level languages. This is a case where there's a high level intention and a low level implementation, and the high level synthesis tool has just ported new constructs into the high level language to do low level optimisation, rather than actually doing the synthesis optimisations that are expected of it.