The Lean proof checker then verifies that the proof actually proves the statement.
In this case an AI is generating the proof, but if it "compiles," it's correct. The only thing humans need to check is the statement being proven.
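To make this concrete, here's a minimal sketch of what that division of labor looks like in Lean 4 (the theorem name is made up; `Nat.add_comm` is a standard library lemma). A human only needs to read the statement on the first line; the proof term below it could come from anywhere, including an AI, and the kernel either accepts it or rejects it:

```lean
-- A human reviews and trusts only this statement:
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  -- The proof term below could have been generated by an AI.
  -- If the kernel accepts it, the theorem is proven; no human
  -- needs to understand or audit how the proof was found.
  Nat.add_comm a b
```

The point is that trust lives in the small, well-audited kernel plus the statement itself, not in whoever or whatever produced the proof.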
(I don't know anything about this project, but I've played around with Lean and used other proof checkers more seriously.)
It seems one side of this argument desperately needs AI to have failed, while the other side is just saying it probably worked but isn't as important as presented, and that it's really just a very promising working methodology going forward.