You hit the nail on the head; that is absolutely the biggest bottleneck.
Right now, I am using Python's multiprocessing to parallelize the commit traversal, and the scanner actively ignores standard binary and media file extensions to keep memory overhead in check. On mid-sized repositories, it holds up nicely. However, on massive monorepos with years of heavy history, it will definitely lag behind compiled Go tools.
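For anyone curious what that looks like, here's a minimal sketch of the approach — the function names and the extension list are illustrative, not the tool's actual API; the real scanner does more than count files:

```python
import multiprocessing
import subprocess

# Illustrative subset of binary/media extensions the scanner skips
IGNORED_EXTENSIONS = {".png", ".jpg", ".gif", ".mp4", ".pdf", ".zip", ".exe"}

def should_scan(path):
    """Return True unless the file has a known binary/media extension."""
    dot = path.rfind(".")
    return dot == -1 or path[dot:].lower() not in IGNORED_EXTENSIONS

def scan_commit(commit_sha):
    """Worker: list the files one commit touched, filter, then scan them."""
    out = subprocess.run(
        ["git", "show", "--name-only", "--pretty=format:", commit_sha],
        capture_output=True, text=True, check=True,
    ).stdout
    files = [f for f in out.splitlines() if f and should_scan(f)]
    # ... run the actual pattern scan over `files` here ...
    return commit_sha, len(files)

def scan_history(commit_shas, workers=4):
    """Fan commits out to a process pool instead of walking them serially."""
    with multiprocessing.Pool(workers) as pool:
        return pool.map(scan_commit, commit_shas)
```

Filtering by extension before reading blobs is what keeps per-worker memory flat; the trade-off is that a secret hidden in a renamed `.png` would slip through.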
To mitigate this for daily workflows, I added a --depth flag so developers can limit the scan to the last N commits (e.g., just checking their current feature branch history before pushing). Profiling and optimizing the commit traversal for massive repos is my next major architectural focus.
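The --depth flag is essentially a thin wrapper over git's own commit limiting — roughly like this sketch (argument wiring is mine; the real CLI may differ):

```python
import argparse
import subprocess

def recent_commits(depth=None):
    """Return commit SHAs on HEAD, optionally capped at the last `depth`."""
    cmd = ["git", "rev-list", "HEAD"]
    if depth is not None:
        cmd += ["--max-count", str(depth)]
    return subprocess.run(
        cmd, capture_output=True, text=True, check=True
    ).stdout.split()

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="history scanner")
    parser.add_argument("--depth", type=int, default=None,
                        help="only scan the last N commits (default: full history)")
    return parser.parse_args(argv)
```

Letting git rev-list do the limiting means the scanner never even enumerates the older commits, so the cost of a pre-push check stays proportional to N rather than to repo age.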