undefined | Better HN

0 pointsmeasurablefunc5mo ago0 comments

You're missing the point. There is no evidence to support their claims which means they are more than likely leaking the memory into the LLM prompt & it is cheating by simply loading constants into memory instead of computing anything. This is why formal specifications are used to constrain optimization. Without proof that the code is equivalent you might as well just load constants into memory & claim victory.

0 comments

3 comments · 1 top-level

fc417fc8025mo ago· 2 in thread

> There is no evidence to support their claims

Do you make a habit of not presuming even basic competence? You believe that Anthropic left the task running for hours, got a score back, and never bothered to examine the solution? Not even out of curiosity?

Also if it was cheating you'd expect the final score to be unbelievably low. Unless you also suppose that the LLM actively attempted to deceive the human reviewers by adding extra code to burn (approximately the correct number of) cycles.

measurablefuncOP5mo ago

This has nothing to do w/ me & consistently making it a personal problem instead of addressing the claims is a common tactic for people who do not know what it means to present evidence for their claims. Anthropic has not provided the necessary evidence for me to conclude that their LLM is not cheating. I have no opinion on their competence b/c that is not what is at issue. They could be incompetent & not notice that their LLM is cheating at their take home exam but I don't care about that.

fc417fc8025mo ago

You are implying that you believe them to be incompetent since otherwise you would not expect evidence in this instance. They also haven't provided independent verification of their claims - do you suspect them of lying as well?

How do you explain the specific score that was achieved if as you suggest the LLM simply copied the answer directly?

1 more reply

j / k navigate · click thread line to collapse