I understand (and don’t know what to say beyond “that’s weird”). I was just reacting to your equating “the first try sometimes doesn’t compile” with “it often fails”. That’s only a failure if you stop at that point.
In the first comment you’ll see that it didn’t compile; I prompted it to fix the error and eventually fixed it myself. When I asked it to proceed to the next task, it blew away my fix and left the code in a broken state.