AI coding agents have made code generation nearly free, and they’ve shifted the bottleneck to code review. Static-only analysis with a fixed set of checkers isn’t enough. LLM-only review has several limitations: non-deterministic across runs, low recall on security issues, expensive at scale, and a tendency to get ‘distracted’.
We spent the last 6 years building a deterministic, static-analysis-only code review product. Earlier this year, we started thinking about this problem from the ground up and realized that static analysis solves key blind spots of LLM-only reviews. Over the past six months, we built a new ‘hybrid’ agent loop that uses static analysis and frontier AI agents together to outperform both static-only and LLM-only tools in finding and fixing code quality and security issues. Today, we’re opening it up publicly.
Here’s how the hybrid architecture works:
- Static pass: 5,000+ deterministic checkers (code quality, security, performance) establish a high-precision baseline. A sub-agent suppresses context-specific false positives.
- AI review: The agent reviews code with static findings as anchors. Has access to AST, data-flow graphs, control-flow, import graphs as tools, not just grep and usual shell commands.
- Remediation: Sub-agents generate fixes. Static harness validates all edits before emitting a clean git patch.
Static solves key LLM problems: non-determinism across runs, low recall on security issues (LLMs get distracted by style), and cost (static narrowing reduces prompt size and tool calls).
On the OpenSSF CVE Benchmark [1] (200+ real JS/TS vulnerabilities), we hit 81.2% accuracy and 80.0% F1; vs Cursor Bugbot (74.5% accuracy, 77.42% F1), Claude Code (71.5% accuracy, 62.99% F1), CodeRabbit (59.4% accuracy, 36.19% F1), and Semgrep CE (56.9% accuracy, 38.26% F1). On secrets detection, 92.8% F1; vs Gitleaks (75.6%), detect-secrets (64.1%), and TruffleHog (41.2%). We use our open-source classification model for this. [2]
Full methodology and how we evaluated each tool: https://autofix.bot/benchmarks
You can use Autofix Bot interactively on any repository using our TUI, as a plugin in Claude Code, or with our MCP on any compatible AI client (like OpenAI Codex).[3] We’re specifically building for AI coding agent-first workflows, so you can ask your agent to run Autofix Bot on every checkpoint autonomously.
Give us a shot today: https://autofix.bot. We’d love to hear any feedback!
---
[1] https://github.com/ossf-cve-benchmark/ossf-cve-benchmark
Traditional regex-based secrets scanners (Gitleaks, TruffleHog, detect-secrets) face a fundamental tradeoff: crank up sensitivity and drown in false positives flagging things like "YOUR_API_KEY_HERE", or tune it down and miss real credentials. We kept hearing from security teams that they couldn't trust their scanning tools because of the noise – developers would just ignore the alerts.
Regex is great at fast pattern matching, but terrible at understanding context. So instead of trying to make regex smarter, we built a hybrid system: regex does the initial high-recall sweep, then a fine-tuned 3B model filters out false positives by actually understanding the code context.
Technical approach: - Started with teacher-student architecture using DeepSeek R1 as teacher - Curated ~8K diverse secrets from Samsung's CredData dataset, relabeled for consistency - Generated synthetic edge cases using Gemini 2.5 Pro and Claude Sonnet 4 - Fine-tuned on ~900 examples with deterministic outputs (not chain-of-thought)
Integration is straightforward – run your existing regex tool, feed candidates to Narada with ±20 lines of context, get structured JSON output with true/false positive classification and reasoning.
We built this as part of Autofix Bot's secrets detection agent, and it outperformed static-only tools significantly in our benchmarks [2]. Figured the security community would benefit from having this available as an open-source building block. Would love to hear your feedback and learn what other edge cases you encounter.
[2] https://autofix.bot/benchmarks#benchmarks-secrets-detection
[3] https://autofix.bot/news/narada-secrets-detection-classifica...
After 5+ years of building AST-based static analyzers that process millions of lines of code daily at DeepSource, we kept hearing a common request from customers: "How do we write custom checks specific to our codebase?" AppSec and DevOps teams have a lot of learned anti-patterns and security rules they want to enforce across their orgs, and being able to do that without being a static analysis expert, came up as an important want.
We initially built an internal framework using tree-sitter [3] for our proprietary infrastructure-as-code analyzers, which enabled us to rapidly create new checkers. We realized that making the framework open-source could solve this problem for everyone.
Our key insight was that writing checkers isn't the hard part anymore. Modern AI assistants like ChatGPT and Claude are excellent at generating tree-sitter queries with very high accuracy. We realized that the tree-sitters' gnarly s-expression syntax isn’t a problem anymore (since the AI will be doing all the generation anyway), and we can instead focus on building a fast, flexible, and reliable checker runtime around it.
So instead of creating yet another DSL, we use tree-sitter's native query syntax. Yes, the expressions look more complex than simplified DSLs, but they give you direct access to your code's actual AST structure – which means your rules work exactly as you'd expect them to. When you need to debug a rule, you're working with the actual structure of your code, not an abstraction that might hide important details.
We've also designed Globstar to have a gradual learning curve: The YAML interface works well for simple checkers, and the Go Interface can handle complex scenarios when you need features like cross-file analysis, scope resolution, data flow analysis, and context awareness. The Go API gives you direct access to tree-sitter bindings, so you can write arbitrarily complex checkers on day one.
Key features:
- Written in Go with native tree-sitter bindings, distributed as a single binary
- MIT-licensed
- Write all your checkers in a “.globstar” folder in your repo, in YAML or Go, and just run “globstar check” without any build steps
- Multi-language support through tree-sitter (20+ languages today)
We have a long way to go and a very exciting roadmap for Globstar, and we’d love to hear your feedback!
[1] https://globstar.dev/guides/writing-yaml-checker