It's somewhat like taking rough pseudo code (the informal proof, a mixture of maths and English) and translating that into a bullet-proof production app (the formal proof, in Lean), where you're going to have to specify every step precisely traced back to axioms, handle all the edge causes, fix incorrect assumptions, and fill in the missing parts that were assumed to be straightforward but might not be.
A major part is you also have to formalise all the proofs your informal proof relied on so everything is traced back to the initial axioms e.g. you can't just cite Pythagorus theorem, you have to formalise that too.
So it's an order of magnitude more difficult to write a formal proof compared to an informal one, and even when you have the informal proof it can teams many years of effort.
But I thought it was a widely celebrated result.
That's part of the motivation to formalise it. When a proof gets really long and complex, relies on lots of other complex proofs, and there's barely any single expert who has enough knowledge to understand all the branches of maths it covers, there's more chance there's a mistake.
There's a few examples here of errors/gaps/flaws found while formalizing proofs that were patched up:
https://mathoverflow.net/questions/291158/proofs-shown-to-be...
My understanding is it's common to find a few problems that then need to be patched or worked around. It's a little like wanting to know if a huge codebase is bug-free or not: you might find some bugs if you formalized the code, but you can probably fix the bugs during the process because it's generally correct. There can be cases where it's not fixable though.
It takes thorough review by advanced mathematicians to verify correctness.
This is not unlike a code review.
Most people vastly underestimate how complex and esoteric modern research mathematics are.
That's actually where LLMs are already quite good at.
There is a pseudocode app that depends on a bunch of pseudocode libraries. They want to translate that pseudocode app into a real runnable app. They can do that, and it's a good amount of work, but reasonable. The problem is to get the app to run they also need to translate the hundreds or thousands of pseudocode libraries into actual libraries. Everything from OS APIs, networking libs, rendering libs, language standard libs all need ro be converted from specs and pseudocode to real code to actually run the app. And that's a ton of work.