If that's the situation you find yourself in, I cannot praise centralized logging with a good frontend highly enough because I frequently find myself trying to figure out what happened to a request, and it's like night and day.
Needing to ssh anywhere and run grep against log files is functional if there's only one or a handful of VMs, but it gets complicated with a handful of machines, and even just SCP-ing the logs off becomes time consuming if there are a lot of machines. Then once the logs are off, 'grep' quickly becomes inadequate. (And I should know, I've built some truly horrible regexps to try and grep for dates because I didn't know any better.)
All that friction means that answering the original question; figuring out a detailed internal reason for why my customer received a 500 http status response error, is just too toilsome for all but the most (as you noted) doesn't happen in .
With centralized logging, I'm able to search for a request ID and see the logs, and this is a reality as often as I need, in order to debug complex multi-system issues.