Here are some links about it:
* https://www.youtube.com/watch?v=lTgERgPNTF8 a 5-minute talk about LLVM doing this
* https://speice.io/2019/02/compiler-optimizations.html discusses this happening in Rust (the Rust compiler uses LLVM)
* http://www.cs.cmu.edu/afs/cs.cmu.edu/user/jatina/www/CS_15_7... a 12-page paper, also on LLVM (PS: GCC can do it too, it was just easier to find LLVM sources)
* https://en.wikipedia.org/wiki/Escape_analysis#Optimizations (the analysis that enables this is called escape analysis: it determines whether a heap allocation "escapes" the current function, or whether it's local enough that stack allocation is a good fit. Inlining increases the likelihood that a heap allocation won't escape)
* https://stackoverflow.com/questions/47075871/can-the-compile... asking if compilers are actually allowed to do that (they sometimes are, but not always)
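To make the escape-analysis point concrete, here is a minimal Rust sketch. The function names `sum_boxed` and `make_boxed` are made up for illustration. In the first function the boxes never leave the function, so an optimizing build may be able to drop the heap allocations entirely. In the second the box is returned, so it escapes and has to stay on the heap. Whether the allocation actually gets elided depends on the compiler version and optimization level, so treat this as something to check in the generated assembly rather than a guarantee.

```rust
/// Hypothetical example: both boxes are created, read, and dropped inside
/// this function, so the allocations do not escape. An optimizing build
/// *may* remove them; this is not guaranteed.
pub fn sum_boxed(a: u64, b: u64) -> u64 {
    let x = Box::new(a);
    let y = Box::new(b);
    *x + *y
}

/// Hypothetical counter-example: the box is handed back to the caller,
/// so the allocation escapes this function and cannot simply be moved
/// to this function's stack frame.
pub fn make_boxed(a: u64) -> Box<u64> {
    Box::new(a)
}
```

If `make_boxed` gets inlined into a caller that only reads the value, the allocation may stop escaping from the optimizer's point of view, which is why inlining helps here.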
You don't really want to rewrite the bit indexing yourself, but it feels like you're paying for a lot of cruft just to avoid a few “x / 8”s.
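For reference, the hand-written version is roughly this small. This is only a sketch: `get_bit` and `set_bit` are hypothetical helpers, and the bit order (bit 0 is the least significant bit of the first byte) is an assumption, not something taken from the code under discussion.

```rust
/// Hypothetical helper: returns whether bit `x` is set.
/// Assumes bit 0 is the least significant bit of `bytes[0]`.
/// Panics if `x / 8` is out of bounds.
pub fn get_bit(bytes: &[u8], x: usize) -> bool {
    (bytes[x / 8] >> (x % 8)) & 1 == 1
}

/// Hypothetical helper: sets or clears bit `x`, same bit order as `get_bit`.
pub fn set_bit(bytes: &mut [u8], x: usize, value: bool) {
    let mask = 1u8 << (x % 8);
    if value {
        bytes[x / 8] |= mask;
    } else {
        bytes[x / 8] &= !mask;
    }
}
```

The `x / 8` picks the byte and the `x % 8` picks the bit within it, which is all the indexing amounts to.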