Also, I have noticed that she takes simple things we don't usually think deeply about and drills down into them, opening up the underlying complexity and pointing a metaphorical magnifying glass at it. (An example I quickly picked from the blog: http://jvns.ca/blog/2014/09/28/how-does-sqlite-work-part-1-p...)
I've done this to debug OpenStack in the past and it worked very well. There are many similar projects for Python; I used this one because it's in the RHEL repo.
Useful summary on StackOverflow: https://stackoverflow.com/questions/4163964/python-is-it-pos...
That has been good enough for the problem of "wtf is this ruby process doing for _minutes_ at a time?" That doesn't get you flame graphs, but you can take a few snapshots and get an idea of what is happening.
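To make the "take a few snapshots" idea concrete, here is a minimal sketch. It is in Python rather than Ruby, and it samples a thread inside the same process instead of attaching to an external one, so treat it as an illustration of the idea and not as the tool discussed here. The names `sample_stacks` and `busy_loop` are made up for the example.

```python
import collections
import sys
import threading
import time
import traceback


def sample_stacks(thread_id, samples=5, interval=0.05):
    """Take a few stack snapshots of one thread and count the innermost frames."""
    counts = collections.Counter()
    for _ in range(samples):
        # sys._current_frames() maps thread ids to their current top frame.
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            # The last entry of the extracted stack is the function running now.
            innermost = traceback.extract_stack(frame)[-1]
            counts[innermost.name] += 1
        time.sleep(interval)
    return counts


def busy_loop(stop):
    # Hypothetical stand-in for the process that is busy "for minutes at a time".
    while not stop.is_set():
        sum(i * i for i in range(1000))


if __name__ == "__main__":
    stop = threading.Event()
    worker = threading.Thread(target=busy_loop, args=(stop,))
    worker.start()
    try:
        # A handful of samples is often enough to see where time is going.
        print(sample_stacks(worker.ident).most_common(3))
    finally:
        stop.set()
        worker.join()
```

A handful of samples won't give you a flame graph, but whichever function name dominates the counts is usually where the time is going.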
For more involved perf debugging I've used ruby-prof.
“I'm constantly surprised by how many people don't know you can do this. It's amazing.”
I'm probably nitpicking, but I'm sad to see this in the article. One of the things I love about Julia's writing otherwise is that it is free of this sort of 'I'm surprised that you don't know this simple thing' expression.
Linux-only seems perfectly acceptable, especially as more and more development moves to Docker containers.
Great write up, looking forward to seeing a completed tool!
I think this is why you shouldn't run this on the production server itself. Each call is very resource intensive on the production server.
I believe the right way to analyze memory is to dump it with "gcore", then use scp to download the dump to a local VM running the same OS as production. Also download the same ruby binary that production is running, and use gdb on the VM to analyze the memory dump.
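Roughly, that workflow could be spelled out like the sketch below. It only builds and prints the commands; it does not run them. The host name, paths, and the `core_dump_workflow` helper are placeholders I made up. The one thing it relies on is that `gcore -o PREFIX PID` writes its dump to `PREFIX.PID`.

```python
import shlex


def core_dump_workflow(pid, host, ruby_path, workdir="/tmp"):
    """Return the commands for the dump-and-analyze-elsewhere workflow.

    The first command is meant to run on the production host; the rest are
    meant to run on the local VM. All arguments are hypothetical examples.
    """
    core_name = f"core.{pid}"
    return [
        # On production: gcore -o PREFIX PID writes PREFIX.PID.
        ["gcore", "-o", f"{workdir}/core", str(pid)],
        # On the VM: fetch the dump and the exact ruby binary production runs.
        ["scp", f"{host}:{workdir}/{core_name}", "."],
        ["scp", f"{host}:{ruby_path}", "./ruby"],
        # On the VM: open the dump against the matching binary.
        ["gdb", "./ruby", f"./{core_name}"],
    ]


if __name__ == "__main__":
    for cmd in core_dump_workflow(1234, "prod.example.com", "/usr/bin/ruby"):
        print(shlex.join(cmd))
```

The point of copying the binary as well is that gdb needs the same executable that produced the dump to make sense of it.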
And that's exactly what this blog post is about.