I work on a 15-year-old, 1M LOC repo. Like yours, it spans the full stack. Bugs in certain pieces of complex business logic would have catastrophic consequences for my employer. Basically, I peel poorly specified work items off my queue, give each one its own worktree and session at high reasoning/effort, and provide a well-specified prompt.
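A minimal Python sketch of the worktree-per-item step, in case it helps picture it. The directory layout (`<repo>-worktrees/<item>`), the `agent/<item>` branch naming, the `main` base branch, and the function names are all assumptions for illustration, not my actual setup. The only real interface used is `git worktree add -b <branch> <path> <commit-ish>`.

```python
import subprocess
from pathlib import Path


def worktree_command(repo: Path, item_id: str, base: str = "main") -> list[str]:
    """Build the git command that creates a worktree and branch for one work item."""
    # Hypothetical layout: worktrees sit next to the repo, one directory per item.
    path = repo.parent / f"{repo.name}-worktrees" / item_id
    # Hypothetical branch naming scheme: one branch per work item.
    branch = f"agent/{item_id}"
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


def create_worktree(repo: Path, item_id: str, base: str = "main") -> None:
    """Run the command; raises CalledProcessError if git reports a failure."""
    subprocess.run(worktree_command(repo, item_id, base), check=True)
```

Calling `create_worktree(Path.cwd(), "ITEM-123")` from the repo root would create an isolated checkout for that item, and the agent session for the item is then started inside that directory so parallel sessions never touch each other's files.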
These things eat into my supervision budget:
* LLM loses the plot and I have to nudge (like you)
* Thinking hard to better specify prompts (like you)
* Reviewing all changes (I do not vibe code except for spikes or other low-risk areas)
* Manual things I have to do (anything I have not yet automated with agent-authored scripts)
* Meetings
* etc
So, yes, my supervision budget is a bottleneck. I can only run 5-8 agents at a time because I have only so much time in the day.
Compare that with running a single agent at high reasoning/effort: I sit waiting for it to think. Waiting for it to find the code area I'm talking about takes time. So does compiling, running tests, and fixing compile errors. And a million other things.
Any time I find myself sitting and waiting, that is my signal to switch to a different session.