Since there are no publicly available details, I would guess this is solved in hardware. The Qualcomm processor can be made to support stronger memory model for translated instructions similar to how the Power processors support strong-access ordering for efficient x86 emulation[1].
[1] https://www.google.com/patents/WO2012101538A1?cl=en