Anecdata, I have also successfully used it to unearth complicated bugs that I already knew must exist, but didn’t have a good way to find.
Think of a query like: Find all calls to function A that have an output pointer-pointer of type B in the last argument position and also have a boolean return type, verify the input dereferenced B is NULL, then verify that iff A assigns output to B that the return type is false and that the calling frame also includes a later call to function C to clean up B. This type of thing run on millions of lines of code.
This can get as crazy as you want it to be and did work, but there is a “but.” The query language is so verbose and powerful, there were many, many ways to represent the same query that all had drastically different performance profiles. The docs were woefully underwhelming in that regard — they simply stated what a member did, but not anything at all related to memory or performance implications. That, coupled with the fact that nearly every complex query during dev hit the wall of the JVM killing it for taking too much time/mem, it became apparent that they must have tooling for the tooling to analyze performance and memory profiles of queries (not unlike all the effort put into SQL over the years). Also, no debugger; resorted to “datalog printf debugging” (ie, including internal clauses as outputs and chopping off later parts of the query...)
I spent a few weeks only writing queries and in every non-trivial one I needed a senior person there to review/rearrange a few clauses to go from e.g. 30 minute runtime to 15 seconds. That left me with a feeling that I was constantly fighting the docs and lack of tooling and would constantly need their help to tune things. Nothing was fundamentally wrong with the queries — it was just not documented anywhere that some filtering should be done caller->down vs. other kinds of checks in the same query should be callee->up.
I had questions regarding scaling up to 1B+ LoC like others in the thread but didn’t really get that far.
It did successfully find the C/C++ bugs I was looking for once the queries had hours of investment from both sides, and we were also able to find bugs in a large custom JS codebase by mocking all the things it would need to understand to eval the code. Whether or not that investment makes sense is an individual/team/org question, but if they ever seriously invest in a query debugger and profiling, they’ll be pretty hard to beat.