I just finished reading the full analysis on GitHub.
> When thinking is deep, the model resolves contradictions internally before producing output.
> When thinking is shallow, contradictions surface in the output as visible self-corrections: "oh wait", "actually,", "let me reconsider", "hmm, actually", "no wait."
Yeah, THIS is something I've seen happen a lot, sometimes even on Opus with max effort.
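
If anyone wants to check this on their own transcripts, here's a rough sketch that counts the marker phrases from the quote in a piece of model output. The function names and the plain substring matching are my own assumptions, not something taken from the analysis, so treat it as a starting point rather than how the author measured it.

```python
# Marker phrases copied from the quoted analysis.
# How to match them (case-insensitive substring search) is an assumption.
MARKERS = ["oh wait", "actually,", "let me reconsider", "hmm, actually", "no wait"]


def count_self_corrections(text: str) -> dict[str, int]:
    """Return how many times each marker phrase appears in text.

    Matching is case-insensitive and purely substring-based, so a span like
    "hmm, actually, ..." is counted under both "hmm, actually" and "actually,".
    """
    lowered = text.lower()
    return {marker: lowered.count(marker) for marker in MARKERS}


def total_self_corrections(text: str) -> int:
    """Sum the per-marker counts into one rough score."""
    return sum(count_self_corrections(text).values())


if __name__ == "__main__":
    sample = "Oh wait, that is wrong. Let me reconsider. No wait, it was fine."
    print(count_self_corrections(sample))
    print(total_self_corrections(sample))
```

A raw count like this says nothing about why the model backtracked, and normal prose can contain "actually," too, so it's only useful for comparing outputs against each other.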