golol · 1y ago

I believe you are misreading this.

First of all, this is not a sport and the point is not to compare AI to humans. The point is to compare AI to IMO-difficulty problems.

Secondly, this is not some hacky trick where brute force and some theorem prover magic are massaged to solve a select few problems and then you'll never hear about it again. They are building a general pipeline which turns informal natural language mathematics (of which we have ungodly amounts available) into formalized mathematics, and in addition trains a model to prove such kinds of mathematics. This can also work for theory building. This can become a real mathematical assistant that can help a mathematician test an argument, play with variations of a definition, try 100 combinations of some estimates, apply a classic but lengthy technique, etc.

lolinder · 1y ago

> First of all, this is not a sport and the point is not to compare AI to humans. The point is to compare AI to IMO-difficulty problems.

If this were the case then the headline would be "AI solves 4/6 IMO 2024 problems", it wouldn't be claiming "silver-medal standard". Medals are generally awarded by comparison to other contestants, not to the challenges overcome.

> This can become a real mathematical assistant that can help a mathematician test an argument, play with variations of a definition, try 100 combinations of some estimates, apply a classic but lengthy technique etc. etc.

This is great, and I'm not complaining about what the team is working on, I'm complaining about how it's being sold. Headlines like these from lab press releases will feed the AI hype in counterproductive ways. The NYT literally has a headline right now: "Move Over Mathematicians, Here Comes AlphaProof".

golol (OP) · 1y ago

At the IMO, "silver medal" afaik is defined as some range of points, which more or less equals some range of problems solved. For me it is fair to say that "silver-medal performance" is IMO language for about 4/6 problems solved. And what's the problem if some clickbait websites totally spin the result? They would've done it anyways even with a different title, and I also don't see the harm. Let people be wrong.

mathnmusic · 1y ago

No, "silver medal" is defined as a range of points to be earned in the allotted time (4.5 hours for each of the two papers, with 3 problems each).

riku_iki · 1y ago

> They are building a general pipeline which turns informal natural language mathematics

but this part currently sucks, because they didn't trust it and formalized problems manually.

golol (OP) · 1y ago

Yea that's fair, but I don't think it will keep sucking forever as formalization is in principle just a translation process.

riku_iki · 1y ago

And we don't have 100% accuracy when translating ambiguous texts, because the system often needs some domain knowledge, context, etc. And math has zero tolerance for mistakes.

I also expect that math formalized by machine will be readable by machine and hardly understandable by humans.
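
To make the ambiguity point concrete, here is a small Lean 4 sketch. It is an illustration only, not anything taken from the AlphaProof pipeline, and it assumes nothing beyond core Lean 4 (no Mathlib). The informal phrase "2 minus 3" can be formalized over the natural numbers or over the integers, and the two choices give different, equally provable statements:

```lean
-- Illustrative sketch only: one informal phrase, "2 minus 3",
-- formalized two ways. Which one is intended depends on context
-- that the informal text may not state.

-- Over the natural numbers, subtraction truncates at zero.
example : (2 : Nat) - 3 = 0 := by decide

-- Over the integers, the result is negative.
example : (2 : Int) - 3 = -1 := by decide
```

The proof checker accepts both, so a verified proof says nothing about whether the formal statement matches what the informal text meant. That gap is where the translation step needs the domain knowledge and context mentioned above.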
