If you are seeing an agent missing tasks, work with it to write down the task list first and then hold it accountable to completing them all. A spec is not a plan.
I ask the model to rename MyClass to MyNewClass. It will generate a checklist like:
- Rename references in all source files
- Rename source/header files
- Update build files to point at new source files
Then it will do those things in that order.
Now you can re-run it but inject the start of the model's response with the order changed in that list. It will follow the new order. The list plainly provides real information that influences future predictions and isn't just a facade for the user.
Are you seriously saying that breaking a large complex problem down into it's constituent steps, and then trying to solve each one of them as an individual problem is just a sensation of rigour?
Edit: I'll give you another example that I realized because someone pointed it out here: when the stupid bot tells you why it fucked up, it doesn't actually understand anything about itself - it's just generating the most likely response given the enormous amount of pontification on the internet about this very subject...
Whist I can't usually start from the exact same point in the decisioning, I can usually bootstrap a new session. It's not all ephemeral.
To your edit: I find that the most galling thing about finding out about the thinking being discarded at cache clear. Reconstruction of the logical route it took to get to the end state is just not the same as the step by step process it took in the first place, which again I feel counters your "feelies".