The point is that if you know the algorithm will produce X as the output if the input is Y, give that as a tool to Claude
And if you know that the previous algorithm completes in Z milliseconds, tell Claude that too and give it a tool (a command it can run) to benchmark its implementation.
This way you don't need to tell it what it did wrong, it'll check itself.
It was the other way around. Claude gave me an algorithm. I found it fishy. So I specifically constructed a counterexample in response to Claude’s algorithm.
Of course when I gave that to Claude, Claude changed the algorithm. But if I didn’t have enough experience and CS fundamentals to find it fishy in the first place, why would I construct a counterexample?