I think the -fwrapv switch in GCC allows this to work properly. I don't actually know about this specific case, but -fwrapv makes signed and unsigned arithmetic to wrap around so that only the appropriate number of low bits are kept.
(I commonly use -fwrapv when writing programs in C.)