It's a bit different for reasoning LLMs: they operate in a feedback loop, measuring the quality of a candidate solution and iterating on it until either the quality meets a desired threshold or the reasoning-effort budget is exhausted.
This loop can correct generation errors, but it cannot correct errors in the quality measurement itself, so the question is valid.
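The loop described above can be sketched roughly as follows. This is a minimal illustration, not a real LLM API: `generate` and `score` are hypothetical callables standing in for the model and its quality measure.

```python
# Minimal sketch of the generate/measure/iterate loop; `generate` and
# `score` are hypothetical stand-ins, not a real LLM interface.
def refine(prompt, generate, score, threshold=0.9, max_effort=8):
    """Generate a draft, measure it, and iterate until the quality
    meets the threshold or the effort budget is exhausted."""
    best, best_q = None, float("-inf")
    critique = None
    for _ in range(max_effort):              # reasoning-effort budget
        draft = generate(prompt, critique)
        q = score(draft)                     # measurement errors propagate here
        if q > best_q:                       # keep the best draft seen so far
            best, best_q = draft, q
        if best_q >= threshold:              # quality meets the bar: stop early
            break
        critique = f"quality={q:.2f}; revise"
    return best, best_q

# Toy stand-ins for demonstration: each retry reveals more of the target.
def _gen(prompt, critique, _state=[0]):
    _state[0] += 1
    return prompt[: _state[0]]

def _score(draft):
    return len(draft) / 5.0

answer, quality = refine("hello", _gen, _score)
```

Note that `refine` trusts `score` completely: a flawed quality measure makes the loop converge confidently on a flawed answer, which is exactly the failure mode the loop cannot self-correct.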