There's a fair amount of talk right now about the value being in the verification layer: once there's a hard verification loop, the agents can do amazing things without getting (permanently) sidetracked. I think what you're working on is halfway there. In essence, you're probably relying on the LLM's notion of what a spec is and what it should be for the codebase.
What's not currently solved, and what I think is very interesting, is how much of the creation of verification can be automated. Even moderate gains on that side would unlock a lot more speed and productivity for all of us.