Why does it need to be checked on a per-request level?
I'd expect you to be able to give short-lived capability tokens to clients that each machine can verify down the stack without making new rpcs. This would avoid the fan-out of all the internal services.
Is it just to prevent abuse?