We built a small SDK that lints prompts before they ever hit an LLM. In practice it behaves like ESLint for prompts: it runs locally, makes no external calls, and flags issues that usually waste tokens or produce inconsistent outputs: unresolved template variables, missing contextual references, contradictory instructions, schema contamination when you expect structured output, and prompts that risk overrunning the model's context window.
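To make one of those checks concrete, here is a minimal sketch of what detecting unresolved template variables might look like. The function name and regex are illustrative assumptions, not the SDK's actual implementation:

```python
import re

# Hypothetical sketch: flag {{var}} or {var} placeholders that survived
# template rendering. Not the SDK's real API; names are illustrative.
PLACEHOLDER = re.compile(r"\{\{\s*\w+\s*\}\}|\{\w+\}")

def find_unresolved_variables(prompt: str) -> list[str]:
    """Return any template placeholders left behind in a rendered prompt."""
    return PLACEHOLDER.findall(prompt)

issues = find_unresolved_variables("Summarize {{article}} for {audience}.")
# issues == ["{{article}}", "{audience}"]
```

A check like this is cheap (one regex pass), which is part of why this class of linting can run inside a prompt generation pipeline without noticeable latency.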
It exposes a single function in code; a CLI for CI is in the works. The analyzer is language agnostic and fast enough to sit in any prompt generation pipeline; we aim for under 50 ms per prompt. There is also a small dev server with a React UI for experimenting interactively.
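For a feel of the single-function shape, a hypothetical usage sketch follows. Everything here, the `lint_prompt` name, the `Issue` fields, the one inline check, is an assumption for illustration, not the SDK's actual surface:

```python
from dataclasses import dataclass

@dataclass
class Issue:
    code: str      # e.g. "unresolved-variable" (illustrative issue code)
    message: str
    severity: str  # "warning" or "error"

def lint_prompt(prompt: str) -> list[Issue]:
    """Stand-in for a single entry point that runs all checks locally."""
    issues: list[Issue] = []
    # One toy check: braces left in the prompt may be leftover placeholders.
    if "{" in prompt and "}" in prompt:
        issues.append(Issue(
            code="unresolved-variable",
            message="possible leftover template placeholder",
            severity="warning",
        ))
    return issues

for issue in lint_prompt("Translate {text} to French."):
    print(f"{issue.severity}: {issue.code} - {issue.message}")
```

In a pipeline, the natural pattern is to call this right after template rendering and fail fast (or log) when any issue has `severity == "error"`.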
The goal is to treat prompts as first-class artifacts and catch structural defects early, rather than debugging outputs after the fact. Happy to answer questions about heuristics, false positives, or how we estimate token overage.
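On token overage: a common rough heuristic for English text (not necessarily what our analyzer does) is to estimate about one token per four characters and compare that against the context window minus the space reserved for the response:

```python
def estimate_overage(prompt: str,
                     context_window: int = 8192,
                     reserved_for_output: int = 1024) -> int:
    """Rough tokens-over-budget estimate; all numbers are illustrative.

    Uses the common ~4 chars/token approximation for English text,
    not a real tokenizer, so treat the result as a coarse warning signal.
    """
    estimated_tokens = max(1, len(prompt) // 4)
    budget = context_window - reserved_for_output
    return max(0, estimated_tokens - budget)
```

Because the approximation is coarse, a linter would typically warn well before the estimated count actually reaches the budget rather than treating the boundary as exact.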
All of it is open source under MIT, and we plan to keep expanding the issue set. We are also exploring a complementary prompt optimization layer that builds on top of the static analysis described above.
Glad to go deeper on any of this, or to help anyone get set up to experiment with it.