Yeah, that's a great option!
Conceptually it's roughly what's going on anyway - something must wrap at some point if we're going to store our offsets in limited space - it's just that we get the wrapping for free from overflow if the wrap point is 2^n.
The confusion above stemmed, I think, from the fact that in the "original" implementation the mask does the wrapping and the projection from offset to index is a no-op. In this implementation, overflow does the wrapping and the mask projects from offset to index. In the implementation you discuss, we pick yet other functions for both.
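To make the split concrete, here's a rough sketch of the three pairings as a "wrap" function and a "project" function each. The names, the capacity of 8, and the 32-bit counter width are all my own assumptions for illustration. Python ints don't overflow, so the overflow is emulated with an explicit modulus. The third pair is only my guess at the variant you mean (wrap at twice the capacity, project with a modulo), not something taken from your description.

```python
CAPACITY = 8             # assumed power-of-two capacity for schemes 1 and 2
MASK = CAPACITY - 1
WORD = 1 << 32           # emulates a 32-bit unsigned counter (assumption)

# Scheme 1 ("original"): the mask does the wrapping, projection is a no-op.
def wrap_masked(offset):
    return (offset + 1) & MASK

def index_identity(offset):
    return offset

# Scheme 2 (this implementation): overflow does the wrapping, the mask projects.
def wrap_overflow(offset):
    # In C this would just be unsigned increment; the modulus stands in for overflow.
    return (offset + 1) % WORD

def index_masked(offset):
    return offset & MASK

# Scheme 3 (hypothetical): other functions for both, so the capacity
# no longer has to be a power of two.
def wrap_double(offset, capacity):
    return (offset + 1) % (2 * capacity)

def index_mod(offset, capacity):
    return offset % capacity
```

In every case the stored offset lives in some bounded range and the index lives in [0, capacity); the schemes only differ in which operation is responsible for which step.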