At the same time... that's not why I'm comfortable writing this. It gets pretty obvious once you know what good vs bad feels like here, and you adjust accordingly:
1. Good: You are able to generate a long plan and that plan mostly works. These are big wins _as long as you are multitasking_: you are high throughput, even if the AI is slow. Think runs of 5-20 min at a time making pretty good progress, in exchange for just a few minutes of planning that you'd largely have to do anyway.
2. Bad: You are wasting a lot of attention on chatting (so 1-2 min runs) and on repairing (re-planning from the top instead of progressing). There is no multitasking win.
It's pretty clear which situation you're in: run duration on its own differs by ~10X between the two.
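If you want to make that check mechanical, here is a minimal sketch in Python. It assumes you log how long each unattended agent run lasted, in minutes. The function name and the two cutoffs are hypothetical: the cutoffs are just the rough 5-20 min and 1-2 min numbers from above, not measured thresholds.

```python
from statistics import median

# Assumed cutoffs, taken from the rough numbers above (not measured thresholds).
LONG_RUN_MIN = 5.0   # "good" runs are roughly 5-20 min
SHORT_RUN_MAX = 2.0  # "bad" runs are roughly 1-2 min


def classify_session(run_minutes):
    """Label a session by the median length of its unattended agent runs."""
    if not run_minutes:
        return "unknown"
    typical = median(run_minutes)
    if typical >= LONG_RUN_MIN:
        return "agentic"  # long runs: you can multitask
    if typical <= SHORT_RUN_MAX:
        return "chat"     # short runs: you are babysitting
    return "mixed"


print(classify_session([12, 8, 20, 6]))
print(classify_session([1, 2, 1.5, 1]))
```

The median is used instead of the mean so that one unusually long or short run doesn't flip the label.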
Ex: I'll have ~3 projects going at the same time, and/or whatever else I'm doing. I'm not interacting "much" so I know it's a win. If a project is requiring interaction, well, now I need to jump in, and it's no longer agentic coding IMO, but chat assistant stuff.
At the same time, I power through case #2 in practice because we're investing in AI automation. We're retooling everything to enable long runs, so we'll still do the "hard" tasks via AI to identify & smooth the bumps. Similar to infrastructure-as-code and SDLC tooling, we're investing in automating as much of our stack as we can. That means figuring out prompt templates, CI tooling, etc. that let the AI handle these tasks, so we can benefit later.