There are two major reasons to loop in humans.
1. How can we be sure ChatGPT knows whether it's correct? It gives incorrect answers to complex questions all the time. The very fact that it gave a correct answer is worth talking about.
2. The type of human that can verify a mathematical proof is also the type of human that knows the appropriate communication channels to let every other math-human know about the proof. The math-humans will know the impact that proof has on math, and how to apply it.