
fragmede 8y ago

> take a request ID, grep the logs, and get an entire story of what happened to that request. In reality, this rarely happens!

If that's the situation you find yourself in, I cannot praise centralized logging with a good frontend highly enough because I frequently find myself trying to figure out what happened to a request, and it's like night and day.

Needing to ssh somewhere and run grep against log files is workable if there's only one or a handful of VMs, but it gets complicated with more than a handful of machines, and even just SCP-ing the logs off becomes time consuming if there are a lot of machines. Then once the logs are off, 'grep' quickly becomes inadequate. (And I should know, I've built some truly horrible regexps to try and grep for dates because I didn't know any better.)

All that friction means that answering the original question (figuring out a detailed internal reason why my customer received an HTTP 500 response) is just too toilsome for all but the most serious cases, so (as you noted) it doesn't happen in reality.

With centralized logging, I'm able to search for a request ID and see the logs, and this is a reality as often as I need, in order to debug complex multi-system issues.
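As a rough illustration of what that kind of search does, here is a minimal Python sketch that pulls every entry for one request ID out of newline-delimited JSON logs from several services and orders them by time. The field names `request_id`, `ts`, `service` and `msg` are assumptions made up for the example, not something any particular logging stack mandates.

```python
import json


def request_story(log_lines, request_id):
    """Return all entries for one request, oldest first.

    Assumes each line is a JSON object with hypothetical `request_id`
    and ISO-8601 `ts` fields (same format and timezone on every line).
    """
    entries = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(entry, dict) and entry.get("request_id") == request_id:
            entries.append(entry)
    # ISO-8601 timestamps in one consistent format sort correctly as strings
    return sorted(entries, key=lambda e: e.get("ts", ""))


# Hypothetical lines as they might arrive from two services
sample = [
    '{"ts": "2017-01-01T10:00:02Z", "service": "db", "request_id": "abc", "msg": "query timeout"}',
    '{"ts": "2017-01-01T10:00:01Z", "service": "web", "request_id": "abc", "msg": "request received"}',
    '{"ts": "2017-01-01T10:00:01Z", "service": "web", "request_id": "xyz", "msg": "request received"}',
    'not json at all',
]

for e in request_story(sample, "abc"):
    print(e["ts"], e["service"], e["msg"])
```

A real centralized logging frontend does this with an index rather than a linear scan, but the shape of the result is the same: one request's entries from every system, in order.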


2 comments · 2 top-level

girvo 8y ago

Word of advice: the 'jq' tool for handling JSON files (coupled with a glob like '*.log' or something fancier with xargs or parallel) will absolutely save your bacon in those situations. It's way more powerful than it appears on the surface.

We had a series of Docker json-file driver log files. Each one is written as a raw list (no array around it) of JSON objects, which is a bit annoying to sort and filter based on properties of the objects.

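'jq -n '[inputs]' (asterisk).log > combined.json' was my favourite command today; it combines all the files' inputs and wraps them in an array correctly. (The -n matters: without it, jq reads the first object as '.' and it never shows up in 'inputs'.) No awk needed!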

Combine that with its cute:

jq -c '.someProp as $var | ($var | test("some search"; "gi")) as $r | if $r then . else null end' (asterisk).log | grep -v "^null$" > filtered.json

And you're off to the races. You can then load the file directly and group_by(.somePath), and it will all magically work!
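For anyone without jq to hand, the same three steps (combine, filter, group) can be sketched in plain Python. This is a stand-in for the jq commands above, not a translation of them: the helper names `load_entries`, `filter_entries` and `group_entries` are made up for the example, and the sample lines assume the usual `log` / `stream` / `time` keys that Docker's json-file driver writes.

```python
import json
import re
from collections import defaultdict


def load_entries(lines):
    """Parse one JSON object per line into a single list."""
    return [json.loads(line) for line in lines if line.strip()]


def filter_entries(entries, prop, pattern):
    """Keep entries whose `prop` value matches `pattern`, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [e for e in entries if regex.search(str(e.get(prop, "")))]


def group_entries(entries, key):
    """Group entries by the value of `key` (roughly what jq's group_by does)."""
    groups = defaultdict(list)
    for e in entries:
        groups[e.get(key)].append(e)
    return dict(groups)


# Sample lines in the shape Docker's json-file driver writes
sample = [
    '{"log": "GET /health 200\\n", "stream": "stdout", "time": "2017-01-01T10:00:00Z"}',
    '{"log": "Error: upstream timed out\\n", "stream": "stderr", "time": "2017-01-01T10:00:01Z"}',
    '{"log": "error while closing socket\\n", "stream": "stdout", "time": "2017-01-01T10:00:02Z"}',
]

entries = load_entries(sample)
errors = filter_entries(entries, "log", r"error")
by_stream = group_entries(errors, "stream")
print({stream: len(items) for stream, items in by_stream.items()})
```

With real files you would feed `load_entries` the lines of each log file in turn; unlike jq's group_by, this returns a dict keyed by the grouping value rather than a sorted array of arrays.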

Edit: had to remove the actual asterisk symbols as they screw with formatting, but they are used for globbing the file names. Replace with the real character.

ldng 8y ago

True, but even with a centralized logging system, if the logs are not good enough you can still find yourself wondering what the hell happened. Grep here is just the tool to extract the "story".
