Past researcher in pure math here. The big problem is that mathematicians are notorious for not providing self-contained proofs of anything because there is no incentive to do so and authors sometimes even seem proud to "skip the details". What actually ends up happening is that if you want a rigorous proof that can be followed theoretically by every logical step, you actually need an expert to fill in a bunch of gaps that simply can't easily be found in the literature. It's only when such a person writes a book explaining everything that it might be possible, and sometimes not even then.
The truth is, a lot of modern math is on shaky ground when it comes to stuff written down.
Math research papers are written for other specialists in the field. Sometimes too few details are provided; indeed I commonly gripe about this when asked to do peer review; but to truly provide all the details would make papers far, far longer.
Here is an elementary example, which could be worked out by anyone with a good background in high school math. Prove the following statement: there exist constants C, X > 0 such that for every real number x > X, we have
log(x^2 + 1) + sqrt(x) + x/exp(sqrt(4x + 3)) < Cx.
i.e. the left-side is big-O of the right.
This flavor of statement comes up all the time in analytic number theory (my subject), would be regarded as obvious to any specialist, and in a paper would be invariably stated without proof.
To come up with a complete, rigorous proof would be long, and not interesting. No professional would want to read one.
I agree this attitude comes with some tradeoffs, but they do seem manageable.
It is technically resolvable. Just insist on Lean (or some other formal verifier) proofs for everything.
It looks to me like math is heading that way, but a lot of mathematicians will have to die before its the accepted practice.
Mathematicians who are already doing formal proofs are discovering it have the same properties as shared computer code. Those properties have have lead to most of the code executed on the planet being open source source, despite the fact you can't make money directly from it.
Code is easy to share, easy to collaborate on, and in the case of formal proofs easy to trust. It is tedious to write, but collaboration is so easy that publishing your 1/2 done work will often prompt others to do some of the tedious stuff. Code isn't self documenting unless it's very well written, but even horrible code is far better at documenting what it does than what you are describing.
This is not an easy change at all. Some subfields of math rely on a huge amount of prereqs, that all have to be formalized before current research in that field can insist on formal proofs for everything as a matter of course. This is the problem that the Lean mathlib folks are trying to address, and it's a whole lot of work. The OP discusses a fraction of that work - namely, formalizing every prereq for modern proofs of FLT - that is expected to take several years on its own.
> Code is easy to share, easy to collaborate on [...] collaboration is so easy that publishing your 1/2 done work will often prompt others to do some of the tedious stuff
Here's an experiment anyone can try at home: pick a random article from the mathematics arxiv [1]. Now rewrite the main theorem from that paper in Lean [2]. Did you find this task "easy"? Would you go out of your way to finish the "tedious" stuff?
> even horrible code is far better at documenting what it does than what you are describing
The "documentation" is provided by talking to other researchers in the field. If you don't understand some portion of a proof, you talk to someone about it. That is a far more efficient use of time than writing code for things that are (relatively) obviously true. (No shade on anyone who wants to write proofs in Lean, though.)
---
[1] https://arxiv.org/list/math/new
[2] https://lean-lang.org/theorem_proving_in_lean4/dependent_typ...
Lean is too inflexible for this, in my opinion. Maybe I'm not dreaming big enough, but I think there'll have to be one more big idea to make this possible; I think the typeclass inference systems we use these days are a severe bottleneck, for one, and I think it's very, very tedious to state some things in Lean (my go-to example is the finite-dimensionality of modular forms of level 1 - the contour integral is a bitch and has a lot of technicalities)
I think what we need in math is that eventually, a collection of frequently cited and closely related papers should be rewritten into a much longer book format. Of course some people do that, but they hardly get much credit for it. It should be much more incentivated, and a core part of grant proposals.
combine lean prove language with inline English human language that explains it. Then, Lean files can be fed into Lean to check the proof, and LaTeX files can be fed into LaTeX to produce the book that explains it all to us mere mortals.
>> log(x^2 + 1) + sqrt(x) + x/exp(sqrt(4x + 3)) < Cx.
These problems are only "uninteresting" to the extent that they can be "proven" by automated computation. So the interesting part of the problem is to write some completely formalized equivalent to a CAS (computer algebra system - think Mathematica or Maple, although these are not at all free from errors or bugs!) that might be able to dispatch these questions easily. Formalization systems can already simplify expressions in rings or fields (i.e. do routine school-level algebra), and answering some of these questions about limits or asymptotics is roughly comparable.
In particular, I'm not making any sort of logical claim -- rather, I know many of the people who read and write these papers, and I have a pretty good idea of their taste in mathematical writing.
Too many papers follow this pattern, especially for the more tedious parts. The two problems are that it makes the lives of students and postdocs torture, and that the experts tend to agree without sufficient scrutiny of the details (and the two problems are intrinsically connected: the surviving students are "trained" to accept those kinds of leaps and proliferate the practice).
Frustratingly, I am often being told that this pattern is necessary because otherwise papers will be enormous---like I am some kind of blithering idiot who doesn't know how easily papers can explode in length. Of course we cannot let papers explode in length, but there are ways to tame the length of papers without resorting to nonsense like the above. For example, the above snippet, abstracted from an actual paper, can be converted to a proof verifiable by postdocs in two short paragraphs with some effort (that I went through).
The real motive behind those objections is that authors would need to take significantly more time to write a proper paper, and even worse, we would need actual editors (gasp!) from journals to perform non-trivial work.
It would be annoying to have to code that kind of statement every time, but it wouldn't add that much to the effort.
Formalizing is still enormously hard, but I don't think that this is the example to hang the argument on.
Another way, just using a cas (like maxima) to compute such limit: (%i1) limit( (log(x^2 + 1) + sqrt(x) + x/exp(sqrt(4*x + 3)))/x,x,inf); the result is zero.
An important lemma we need is that if f(x) is continuous on [a,b] and f'(x) > 0 on (a,b), then f(a) < f(b). For this, we need to write out the mean value thoerem, which needs Rolle's theorem, which needs the extreme value theorem, which gets into the definition of the real numbers. As well as all the fun epsilon–delta arguments for the limit manipulations, of course.
x/exp(sqrt(4x+3)) < x thankfully follows trivially from exp(sqrt(4x+3)) > 1 = exp(0). Since sqrt(4x+3) > 0, we just need to show that exp(z) is increasing. For this, we can treat the exponential as the anti-logarithm (since that's how my high school textbook did it), then show that log(z) is increasing, which follows from log'(z) = 1/z > 0 and our lemma.
For log(x^2+1) < x, we'll want to use our lemma on f(z) = z - log(z^2+1) to show that f(x) > f(0) = 0. Here, f'(z) = 1 - 2x/(x^2+1), for which we need the chain rule and the power rule (or a specialized epsilon–delta argument). Since x^2+1 > 0, f'(z) > 0 is implied by (x^2+1) - 2x > 0, which luckily factors into the trivial (x-1)^2 > 0. So ultimately, we break the lemma up into (0,1) and (1,x) to avoid trouble with the stationary point.
The trouble with problems like these is the whole foundation that the obvious lemmas have to be built upon. "Just look at the graph, of course it's increasing" isn't a rigorous proof. Of course, if you want to do this seriously, then you go and build that foundation, and on top of that you probably define some helpful lemmas for the ladder of asymptotic growth rates. But all of the steps must be written out at some point or another.
I don't get why this is true. If C is 1 and x is 2
log(x^2+1) sqrt(2) + x/exp(sqrt(4*2 + 3)) is not less than 2.
WTF
>i.e. the left-side is big-O of the right.
Oh.
Sounds like Tao's third stage, of informed intuition.
So, I agree with the author that this is super helpful, even if we know the proofs are "true" in the end
For the prof, yeah, easily observable. What about the students who try to absorb that particular article? You already have to balance the main topic in your brain, and you get these extra distractions on top of it.
IIUC you're saying that basically someone with a mental database should be able to identify preconditions matching some theorem and postconditions aligned with the next statement from their mental database.
Or is there math that can't be expressed in a way for proof checkers to assess atm?
Edit: Or maybe proof checker use just isn't as wide-spread as I imagined. It sounds like the state statically typed languages in programming...
This is essentially exactly what Mathlib[1] is, which is Lean's database of mathematics, and which large portions of the FLT project will likely continually contribute into.
[1]: https://github.com/leanprover-community/mathlib4/
(Other theorem provers have similar libraries, e.g. Coq's is called math-comp: https://math-comp.github.io/ )
I.e. a widely accepted proof that had a critical flaw due to hand waving?
If it hasn't, I would understand why they'd be cavalier about explicit details.
Well, I never was much of a number theorist. I never did come to understand the basic definitions behind the BSD conjecture. Number theory is so old, so deep, that writing a PhD on the topic is the step one takes to become a novice. Where I say that I didn't understand the definitions, I certainly knew them and understood the notation. But there's a depth of intuition that I never arrived at. So the uproar of experts, angry that I had the audacity to hope for a counterexample, left me more curious than shaken: what do they see, that they cannot yet put words to?
I am delighted by these advances in formalism. It makes the field feel infinitely more approachable, as I was a programmer long before I called myself a mathematician, and programming is still my "native tongue." To the engineers despairing at this story, take it from me: this article shows that our anxiety at the perceived lack of formalism is justified, but we must remember that anxiety is a feeling -- and the proper response to that feeling is curiosity, not avoidance.
Also not a number theorist...but I'd bet those so-called experts had invested far, far too many of their man-years in that unproven conjecture. All of which effort and edifice would collapse into the dumpster if some snot-nosed little upstart like you, using crude computation, achieved overnight fame by finding a counter-example.
(If I could give my many-decades-ago younger self some advice for math grad school, one bit of that would be: For any non-trivial "Prove X" assignment, start by spending at least 1/4 of my time budget looking for counter-examples. For academic assignments, that's 99% likely to fail. But the insight you'll get into the problem by trying will be more worth it. And the other 1% of the time you'll look like a genius. And - as soon as you attempt real math research, those odds shift enormously, in favor of the counterexample-first approach.)
Not at all. In fact, if I had found a counterexample, it would cause a flurry of new research to quantify exactly how wrong the BSD conjecture is. Such a finding would actually be a boon to their career! That's why my response is curiosity, and not to sneer at them for protecting their extremely secure tenured careers.
Edit 1: And if you think you've found a counterexample to a long-standing conjecture with a computation, you'd better be damned sure that your computation is correct before opening your mouth in public. And that takes a ton of work in the case of the BSD conjecture, because you've almost certainly identified a bug in the extremely complex code underlying that computation. If I ever thought I was holding onto such a counterexample, I'd approach a human calculator like Ralph Greenberg as my first step (after internal checks: re-running the code on another computer to rule out cosmic bit flips, and perhaps running more naive, unoptimized implementations).
Edit 2: This attitude pervades my software development career, and I've brought it to my foray into superconducting circuit design: a bug report brings joy to my life, and I aim to shower the reporter with praise (which may involve chocolate). There is nothing more satisfying than being proven wrong, because it helps us collectively move toward greater truths.
No one knows how to compute the size of that group in general (in fact no one has proved that it's finite!). Computing the rank of a curve via non-analytic means is more akin to a bespoke proof than a straightforward computation (see Noam Elkies' work).
So saying you're going to disprove BSD with blind computation is rather naive unless you're sitting on several career-defining proofs and not sharing them.
Say I show up to a physics conference and proclaim that I hope my computational effort will disprove some major physical law. Well, you better have a good delivery, or the joke might not land well!
Sometimes people take things a little too literally after attending hours of very dry talks at serious seminars. I wouldn't read too much into it.
Except it wouldn't, because the work towards the BSD would still be right and applicable to other problems. If someone proved the Riemann hypothesis false, all of our math (and there is a lot of it) surrounding the problem isn't immediately made worthless. The same is true for any mathematical conjecture.
I don't doubt the rest of your comment might have played a role, however.
Are you maybe confusing math academia for psychology or social sciences? There is no replication crisis in math, no house of cards of self-proclaimed experts riding on bullshit. Mathematicians are _actually experts_ at a deep and extremely rigorous technical field -- many of them are even experts at computational approaches to problems! -- and when outsiders and upstarts resolve old conjectures, mathematicians generally react by celebrating them and showering them with fame, job offers and gushing articles in Quanta.
I am curious what a group of number theorists "going wild" looks like.. were they throwing chairs or are we talking slightly raised eyebrows?
Beautiful sentiment, also in other areas of life, thank you.
Oh well, hopefully when they get the machines to solve math, they’ll still want them to run a couple percents faster every year.
However, unlike board games where the concepts can be explained to you in a few minutes (usually it becomes clear only when you play), a lot mathematics especially algebraic geometry/number-theory can have many layers of dependencies which takes years to absorb.
It would be interesting to compare it to understanding a large open source project like the Linux kernel, well enough to contribute. I would say it is not so conceptually deep as mathematics of the article (while still having a few great ideas). But understanding the source would require familiarization with 'tedious details' which incidentally, what this article is also about.
So the issue, stated this way, is not so much raw talent as time and effort. This leads to the topic of motivation - finding something great in an idea can lead to investment in the subject. For those more talented, the journey might be easier.
Alan Kay's maxim is crucial - a change of perspective can simplify things relative to 80 extra IQ points. A long sequence of technical steps can be compressed into a single good intuitive definition/idea like the complexity of navigating a path becomes clear from a higher viewpoint.
Grothendieck, considered by many to be the best mathematician of the past century, made the point that there were several colleagues who were more proficient at symbolic manipulation. But, he was able to look at things with a more foundational perspective and could make great contributions.
Here's a good essay by Thurston "Proof and Progress in mathematics". https://arxiv.org/pdf/math/9404236
He discusses this problem of being intimidated by jargon.
And yes, the desire for formalism has been stronger in the descendants than the ancestors for some time, but we're seeing that gap close in real time and it's truly exciting.
And, lets keep pushing the frontier of computation! We aren't just making better ad-delivery devices.
My bet: there is a very long list of skills for which the gap between you and a person randomly picked out the whole population is very large, and not in the direction you seem to think.
It makes me think about how you get UI/UX/web designers who make (informal, imprecise) mockups and prototypes of their design ideas and interaction flows, they then hand these off to a developer to do the coding (formalise it, explain it precisely to the machine), and on the way the developer will inevitably find problems/holes in the designs (like an interaction scenario or code path not considered in the design, which could uncover a major design flaw), which the developer and/or the designer will have to patch somehow.
As in, design and development are different roles and require different mindsets, and most designers have a very strong resistance against working and thinking like a developer.
This was already tried, and failed (Hilbert). In the aftermath of the failure we learned that mathematics cannot be completely formalized. So this points to a fundamental problem with using AI to do math.
Nor is this in any way "a fundamental problem with using AI to do math".
In cases where the Hilbert machine applies, the mathematician writing down the semi-formal argument already has to state the new axioms, justify them, and reason about how they apply, and the proposal would just take that "reason about how they apply" step and write it in a computer-verifiable way.
No it doesn't, AI like an LLM has no issues with stating two inconsistent propositions, just like humans, which is why both AI and humans can reason about all of mathematics, eg. Godel theorems is that a formal system cannot simultaneously be complete and consistent.
Who likely underestimated how hard it would be.
And who didn't have a 2024 computer on hand.
I've always wondered if this intuition was really true. Would it really be so impossible for a whole branch of mathematics to be developed based on a flawed proof and turn out to be simply false?
Similarly, once in a while math gets applied and is used to make predictions. When the math is wrong, those predictions are wrong. And those wrong predictions draw a lot of attention.
It would be a very strange situation if the foundation fell apart logically without any ramifications in the rest of this "knot". A huge swathe of free-floating mathematics that's completely internally consistent but for this one error? Difficult to imagine for me in the case of cohomology from the article.
I guess strictly speaking this is more of a philosophical stance - I like to believe that a lot of the current mathematics has been discovered "naturally" in some sense =)
Example: https://github.com/ImperialCollegeLondon/FLT/blob/main/FLT/M...
Also check out the blueprint, which describes the overall structure of the code:
https://imperialcollegelondon.github.io/FLT/blueprint/
I'm very much an outside observer, but it is super interesting to see what Lean code looks like and how people contribute to it. Great thing is that there's no need for unittests (or, in some sense, the final proof statement is the unittest) :P
In such a project theorems and proofs are "the main point of the software". The unit tests make sure certain things don't go wrong by noticing when developers, e.g., mess up while refacing something. Also, people actually put the things I was talking about in a folder called "test"...
All of which is a long way of saying the line "the old-fashioned 1990s proof" makes me feel _really_ old.
Much worse, this nonchalant attitude is being taught to PhD students and postdocs both explicitly and implicitly: if you are worried too much, maybe you are not smart enough to understand the arguments/your mathematical sense is not good enough to perceive the essence of the work. If you explain too much, your readers will think you think they are dumb; erase this page from your paper (actual referee feedback).
Also, like Loeffler in the comments, I don't trust the "people have been using crystalline cohomology forever without trouble" argument. The basics are correct, yes, as far as I can tell (because I verified them myself, bearing in mind of course that I am very fallible).
But precisely because of that, large swathes of the theory will be correct. Errors will be rare and circumstantial, and that is part of the problem! It makes them very easy to creep into a line of work and go unnoticed for a long time, especially if the expert community of the area is tiny---as is the case in most sub-areas of research math.
Are you aware of the book on [The Disc Embedding Theorem](https://academic.oup.com/book/43693) based on 12 lectures Freedman gave roughly a decade ago.
Edit: I see from the MO comments "The fully topological version of the disc embedding theorem is beyond the scope of this book, since we will not discuss Quinn's proof of transversality."
I don't remember the details of the story (I read surely your joking years ago), and remember being amazed by how much time that policy must have cost him.
But now I wonder that he didn't hit dozens or hundreds of errors over the years.
There's a good popsci comms video on why here https://youtu.be/TwKpj2ISQAc?si=bpZOBy9WBGQzi6sk
I guess the same can be said about many other "fanboi", and have little to do with the facts in the book
Tongue-in-cheek irritation aside, this is actually awesome. I think Lean (and other theorem provers) are going to end up being important tools in math going forward.
Many of my troubles probably come from the fact that I only have a BSc in maths and that I'm not very familiar with Lean/mathlib and don't have anyone guiding me (although I did ask some questions in the very helpful Zulip community). Many results in mathlib are stated in rather abstract ways and it can be hard to figure out how they relate to certain standard undergraduate theorems - or whether those are in mathlib at all. This certainly makes sense for the research maths community, but it was definitely a stumbling block for me (and probably would be one if Lean were used more in teaching - but this is something that could be sorted out given more time).
In terms of proof automation, I believe we're not there yet. There are too many things that are absolutely harder to prove than they should be (although I'm sure that there's also a lot of tricks that I'm just not aware of). My biggest gripe concerns casts, in "regular" mathematics, the real numbers are a subset of the complex numbers and so things that are true for all complex numbers are automatically true for all reals[0], but in Lean they're different types with an injective map / cast operation and there is a lot of back-and-forth conversion that has to be done and muddies the essence of the proof, especially when you have "stacks" of casts, e.g. a natural number cast to a real cast to a complex number etc.
Of course, this is somewhat specific to the subject, I imagine that in other areas, e.g. algebra, dealing with explicit maps is much more natural.
[0] This is technically only true for sentences without existential quantifiers.
The issue with stacked casts is mostly solved by the `norm_cast` tactic. Again, ask more questions on Zulip - even if you don't ask about this in particular, if you suggest it in passing, or your code gives indications of an unnecessarily complicated proof style, you will get suggestions about tactics you may not be aware of.
One way you can focus a question like this if you don't know what techniques to use but just have a feeling that formalization is too hard, is to isolate an example where you really had to work hard to get a proof and your proof is unsatisfying to you, and challenge people to golf it down. These kind of questions are generally well received and everyone learns a lot from them.
I can usually find a way around such things but it does make proofs more tedious. As said, I'm sure I'm not using Lean optimally, but I wouldn't know what to ask for specifically, it's a case of "unknown unknowns".
> One way you can focus a question like this if you don't know what techniques to use but just have a feeling that formalization is too hard, is to isolate an example where you really had to work hard to get a proof and your proof is unsatisfying to you, and challenge people to golf it down.
Fair, that's something I could try. For example, my proof that cos 2 ≤ -1/3 (which is used for showing that cos has a zero in (0,2) with which I can define pi) is unreasonably complicated, while the proof on paper is just a couple of lines. There are a bunch of techniques used when estimating series, such as (re)grouping terms, that I found difficult to do rigorously.
Okay, for some decades, I've read, written, taught, applied, and published, in total, quite a lot of math. Got a Ph.D. in applied math.
Yes, there are problems in writing math, that is, some math is poorly written.
But, some math is quite nicely written. (1) Of course, at least define every symbol before using it. (2) It helps to motivate some math before presenting it. (3) Sometimes intuitive statments can help.
For more, carefully reading some well-written math can help learning how to write math well:
Paul R.\ Halmos, {\it Finite-Dimensional Vector Spaces, Second Edition,\/} D.\ Van Nostrand Company, Inc., Princeton, New Jersey, 1958.\ \
R.\ Creighton Buck, {\it Advanced Calculus,\/} McGraw-Hill, New York, 1956.\ \
Tom M.\ Apostol, {\it Mathematical Analysis: Second Edition,\/} ISBN 0-201-00288-4, Addison-Wesley, Reading, Massachusetts, 1974.\ \
H.\ L.\ Royden, {\it Real Analysis: Second Edition,\/} Macmillan, New York, 1971.\ \
Walter Rudin, {\it Real and Complex Analysis,\/} ISBN 07-054232-5, McGraw-Hill, New York, 1966.\ \
Leo Breiman, {\it Probability,\/} ISBN 0-89871-296-3, SIAM, Philadelphia, 1992.\ \
Jacques Neveu, {\it Mathematical Foundations of the Calculus of Probability,\/} Holden-Day, San Francisco, 1965.\ \
But now they had to find a suitable replacement to underpin the field.
By the way, I found an excellent word for when a proof is disproven or found to be faulty, but that is esoteric enough that it has less risk of being misinterpreted to mean the conclusion is proven false: 'vitiated'. The conclusion might still be true, it just needs a new or repaired proof; the initial proof is 'vitiated'. I like how the word sounds, too.
So I have wondered about PdF's possible thinking.
Now, the degenerate case of n=2 is just the Pythagorean Theorem
c^2 = a^2 + b^2
and we now know from AW's proof that the cubic and beyond fail.Now, the PT is successful because the square b^2 can be "squished" over the corner of a^2, leaving a perfect square c^2. [Let's let the a^n part be the bigger of the two.]
5^2 = 4^2 + 3^2
25 = 16 + 9
25 = 16 + (4 + 4 + 1)
Each of the 4's go on the sides, and the 1 goes on the corner, leaving a pure 5x5 square left over.Now, for cubes, we now know that a cube cannot be "squished" onto another cube's corner in such a way that makes a bigger cube.
I'm not up for breaking out my (diminishing) algebra right now (as it's a brain off day), but that b^3 cube would need to break down into three flat sides, three linear edges, and the corner.
This fits my physical intuition of the problem and seems to me to be a possible way that PdF might have visualized the cubic form. Now, I have zero mathematical intuition about such things (or necessary math skills either), but the physical nature of the problem, plus the fact that we now know that the cubic and beyond don't work, leads me to wonder if this is an interesting approach. It also makes me curious to know why it isn't, if that is indeed the case, (which I assume is probable).
As to the n=4 and beyond, I would just hand-wave them away and say something like, "Well, of course they are more fractally impossible", which by that I mean that a real mathematician would have to say exactly why the n=3 case failing means that the n>=4 case would also not work. (My guess here is that there just becomes less and less possibility of matching up higher-dimensional versions of a two-term addition.)
Anyway, just a thought I had some years ago. I even built a lego model of the bits of cube spreading out along the corner of the a^3 cube.
I would enjoy learning why I'm so utterly wrong about this, but I found it a fun mental exercise some years ago that's been swimming around in my brain.
Thanks in advance if there are any takers around here. :-)
[And, my favorite Horizon episode is 'Freak Wave'.]
So it is more likely that Fermat saw n=4, and thought the rest would be similar.
One of the many reasons I love math is the feeling I've always had that with it, we're building skyscraper-sized mental structures that are built on provably indestructible foundations.
With the advent of "modern" proofs, which involve large teams of Mathematicians who produce proof that are multiple hundreds of pages long and only understandable by a very, very tiny sliver of humanity (with the extreme case of Shinichi Mochizuki [1] where N=1), I honestly felt that math had lost its way.
Examples: graph coloring theorem, Fermat's last theorem, finite group classification theorem ... all of them with gigantic proofs, out of reach of most casual observers.
And lo and behold, someone found shaky stuff in a long proof, what a surprise.
But it looks like some Mathematicians feel the same way I do and have decided to do something about it by relying on the computers. Way to go guys !