I've also posted a few more comments in this twitter thread: https://twitter.com/felixge/status/1571850160358965249
The Go CPU profiler is great for reducing CPU utilization. But unless you're CPU-bound, it's not very useful for improving latency. fgtrace is trying to help with that.
Huh, I wonder if this is a temporary limitation or an issue with the approach. In my experience, if you're doing profiling you're probably better off with something lighter weight that gives you more honest numbers.
Edit: reading closer, it looks like the Go team had similar concerns. I wonder if this can capture how long a goroutine was unmounted for.
The bigger problem is capturing the stack traces for all goroutines. Rhys added a patch to Go 1.19 [1] that mostly moves this work outside of the critical STW section, which greatly reduces the overhead. Unfortunately this improvement only applies to the official goroutine profiling APIs, and those do not provide details such as goroutine ids. This means fgtrace has to use runtime.Stack() which returns the stack traces as text (yikes) and isn't optimized like the other goroutine profiling APIs.
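To make the "stack traces as text" point concrete, here is a minimal sketch of dumping all goroutines with runtime.Stack() and pulling the goroutine ids back out of the text. This is not fgtrace's actual code; the helper names (allStacks, goroutineStates) and the regular expression are assumptions for illustration, and the header format they parse is an implementation detail of the runtime's text dump rather than a documented API.

```go
package main

import (
	"fmt"
	"regexp"
	"runtime"
	"strconv"
	"sync"
)

// goroutineHeader matches header lines such as "goroutine 18 [chan receive]:".
// The format is not a stable API; this pattern is an assumption.
var goroutineHeader = regexp.MustCompile(`(?m)^goroutine (\d+) [^\[\n]*\[([^\]]+)\]:`)

// allStacks returns the text dump of every goroutine's stack.
// runtime.Stack truncates output that does not fit, so the buffer is
// doubled until the dump fits.
func allStacks() []byte {
	buf := make([]byte, 64<<10)
	for {
		n := runtime.Stack(buf, true) // true = include all goroutines
		if n < len(buf) {
			return buf[:n]
		}
		buf = make([]byte, 2*len(buf))
	}
}

// goroutineStates maps each goroutine id found in the dump to the text
// inside the brackets of its header line (for example "running").
func goroutineStates(dump []byte) map[int]string {
	states := make(map[int]string)
	for _, m := range goroutineHeader.FindAllSubmatch(dump, -1) {
		id, err := strconv.Atoi(string(m[1]))
		if err != nil {
			continue
		}
		states[id] = string(m[2])
	}
	return states
}

func main() {
	var ready, done sync.WaitGroup
	release := make(chan struct{})
	for i := 0; i < 3; i++ {
		ready.Add(1)
		done.Add(1)
		go func() {
			defer done.Done()
			ready.Done()
			<-release // park here so the goroutine shows up in the dump
		}()
	}
	ready.Wait()

	for id, state := range goroutineStates(allStacks()) {
		fmt.Printf("goroutine %d: %s\n", id, state)
	}

	close(release)
	done.Wait()
}
```

Every sample has to format and then re-parse text like this for every goroutine, which is part of why this approach is expensive compared to the optimized goroutine profiling APIs.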
There are various ways the implementation details of fgtrace and the Go runtime could be improved for this use case (wallclock timeline views), and I'm hoping to work on contributions in the coming months.
He notes the performance & scalability issues already noted here by other commenters.
> Probably the right thing to do is figure out more of a trace like the current trace profiles but perhaps less low level.
This is the key takeaway for me.
I think there's room for tracing support somewhere in-between runtime/trace and full blown distributed traces (e.g., OpenTelemetry[1]) - so I'm hopeful this effort may evolve into a good solution in that space.
From a usability point of view, my biggest gripe right now with the Go tracer is that its viewer is...painful. It uses the trace viewer that's built into Chrome, which Chrome itself is moving away from.
I'd hacked around a bit recently to try and get the existing Go traces into perfetto[2], with some success. As I recall, I couldn't get user traces functioning.
The `go tool trace` server has an API to output compatible JSON, but it's limited in what it outputs. Unfortunately, the trace file itself is in a custom binary format. All the tools for manipulating it are in `internal/` folders, making them unavailable for import, so creating new tools for working with the traces is quite burdensome.
I'd debated copying the code out into a new project and starting to hack on it, but at that point I'd reached the end of my willingness to invest time. Perhaps I should open an issue or message the mailing list to see what the maintainers think the future of runtime tracing looks like.
[0] https://github.com/golang/go/issues/41324#issuecomment-70379...
Go 1.19 has made some improvements in this regard [1]. But yes, profiling all goroutines does not scale to programs that use more than perhaps 10k goroutines, which isn't entirely uncommon. To overcome this, the goroutine profile API would need to be extended to allow profiling a subset of goroutines. pprof labels could be used to specify which goroutines should be profiled.
> Probably the right thing to do is figure out more of a trace like the current trace profiles but perhaps less low level.
Yeah, in the long run the tracer, perhaps in combination with the cpu profiler [2], also offers a great way of capturing this data. But right now it's too much of a firehose, so it probably needs some way of selecting a subset of goroutines to trace as well. Additionally the unwinding of stack traces is a major bottleneck, so maybe frame pointer unwinding or similar will be needed to make it faster.
I've heard about future plans for the tracer that would help with the custom binary format problem, so hopefully this will improve.
Anyway, I mostly see fgtrace as a "Do Things that Don't Scale" [3] kind of project. If people like the value it can provide, it will likely motivate me and others to figure out how to build a version of it that is safe for production usage :).
[1] https://go-review.googlesource.com/c/go/+/387415