But really, if you spend that much time thinking about the performance, you really shouldn't have that abstracted into a function. Calling that costs cycles and blows out one entry in your return stack - which makes a more costly mispredict later on likely.
Why yes, yes I do miss fiddling around with low level details. :)