Interesting use case. It didn't occur to me that tail calls can also be purely a performance optimisation technique that helps out the compiler and branch predictor. I assumed hot loops could be implemented just as well using GOTOs, but maybe not?
Everything can be implemented using IF statements and GOTOs. That's how early processors worked, and it's enough for Turing completeness. We don't _really_ need function calls, while loops, or for loops either.
In the particular linked case of protobuf parsing, a loop with gotos doesn't produce very well optimized code because of specific internal details of how modern C compilers do optimizations. You could certainly imagine a compiler that can fully optimize a goto-heavy program, in which case code cleanliness would be the only remaining reason to prefer tail calls.
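For reference, here is roughly what the "loop with gotos" shape looks like. This is a toy sketch, not the protobuf parser: the three-opcode bytecode and the name `run_goto` are made up for illustration. The point is only that every handler lives inside one function and shares one set of locals, so the compiler has to reason about the whole thing at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy bytecode, invented for this example. */
enum { OP_INC = 0, OP_DOUBLE = 1, OP_HALT = 2 };

/* Interpreter written with nothing but IF and GOTO.
 * All state (pc, acc) lives in locals shared by every handler. */
int64_t run_goto(const uint8_t *pc) {
    int64_t acc = 0;
    assert(pc != NULL);

dispatch:
    if (*pc == OP_INC) goto do_inc;
    if (*pc == OP_DOUBLE) goto do_double;
    goto done; /* OP_HALT (or any unknown opcode) stops the machine */

do_inc:
    acc += 1;
    pc++;
    goto dispatch;

do_double:
    acc *= 2;
    pc++;
    goto dispatch;

done:
    return acc;
}
```

Running it on the program `{OP_INC, OP_INC, OP_DOUBLE, OP_INC, OP_HALT}` walks the accumulator through 1, 2, 4, 5. With three opcodes this compiles fine; the trouble described above shows up when the function has dozens of handlers with fast and slow paths mixed together.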
Tail calls are basically GOTOs, yeah. They bring a HUGE benefit: the state that flows between the gotos is very clearly defined, which makes the compiler's job super easy.
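A minimal sketch of what "clearly defined state flow" means in practice. The toy bytecode, the `MUSTTAIL` and `DISPATCH` macros, and `run_tail` are all made up for this example. The `musttail` attribute itself is real in Clang and in recent GCC; on compilers without it the macro expands to nothing and you only get a tail call if the optimizer feels like it. Each handler receives the entire machine state as arguments and hands it to the next handler, so there are no shared locals for the compiler to track.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Use a guaranteed tail call where the compiler offers one. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL /* no guarantee: an ordinary call that may or may not be optimized */
#endif

/* Hypothetical toy bytecode, invented for this example. */
enum { OP_INC = 0, OP_DOUBLE = 1, OP_HALT = 2 };

/* Every handler has the same signature: the whole state is (pc, acc). */
typedef int64_t handler_fn(const uint8_t *pc, int64_t acc);

static handler_fn op_inc, op_double, op_halt;
static handler_fn *const handlers[] = { op_inc, op_double, op_halt };

/* "goto the next handler", with the state passed explicitly.
 * No validation: assumes every opcode is one of the three above. */
#define DISPATCH(pc, acc) MUSTTAIL return handlers[*(pc)]((pc), (acc))

static int64_t op_inc(const uint8_t *pc, int64_t acc) {
    acc += 1;
    pc++;
    DISPATCH(pc, acc);
}

static int64_t op_double(const uint8_t *pc, int64_t acc) {
    acc *= 2;
    pc++;
    DISPATCH(pc, acc);
}

static int64_t op_halt(const uint8_t *pc, int64_t acc) {
    (void)pc; /* nothing left to read */
    return acc;
}

/* Entry point: start with acc = 0 and jump into the first handler. */
int64_t run_tail(const uint8_t *program) {
    assert(program != NULL);
    return handlers[*program](program, 0);
}
```

Each handler is a small function the compiler can optimize on its own, and the arguments spell out exactly which values are live at every jump. That is the part a plain goto doesn't give you.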