YeGoblynQueenne · 1mo ago · 0 points

These are different cases, yes? The person in the SA article you link is described as an "amateur", but Timothy Gowers is not an amateur and he is much more capable of guiding an LLM with domain expertise than an amateur.

Then there's the kind of problem we're talking about. The "amateur" in the SA article solved one of the Erdős problems, and Gowers himself seems to think that this, on its own, is not a cause for concern. He distinguishes his own result from that kind of earlier result at the start of his article:

>> The background is that, as has been widely reported, LLMs are now capable of solving research-level problems, and have managed to solve several of the Erdős problems listed on Thomas Bloom’s wonderful website. Initially it was possible to laugh this off: many of the “solutions” consisted in the LLM noticing that the problem had an answer sitting there in the literature already, or could be very easily deduced from known results.

So we have, on the one hand, an "amateur" who "vibe-solved" an Erdős problem that may or may not already have had a solution lurking in the wings; and, on the other hand, an expert who solved a harder problem through interactive use rather than vibe-solving. There's no reason to believe that we can "Replace chatgpt with a human in both of these stories", as you say.

And btw there's scholarship that indicates vibe-solving is not yet ready to replace mathematicians like Timothy Gowers:

First Proof

To assess the ability of current AI systems to correctly answer research-level mathematics questions, we share a set of ten math questions which have arisen naturally in the research process of the authors. The questions had not been shared publicly until now; the answers are known to the authors of the questions but will remain encrypted for a short time.

https://arxiv.org/abs/2602.05192

See Appendix A for initial results.

famouswaffles · 1mo ago

Yes, these are different instances.

My first point is that I think you are overrating 'interactive use' a bit here. As Timothy already explains in the article, were it a human he 'guided' in a similar way, he would not get credit for those achievements by any stretch of the imagination. And I think that's an important part of realizing why these sorts of people are beginning to discuss these things.

Second, I didn't say anything about models being ready to replace mathematics wholesale. But should people really wait until that happens before discussing it? I know it's human nature to wait until the problem or situation is upon you, but I don't think that would be prudent or wise. And even just for the sake of curiosity, it would be boring.

I think the matter of fact here is that in the last few months with the last few models, capabilities in this area have jumped to a very meaningful degree. It would be stranger if no one was talking about it.
