
0 points · atleastoptimal · 9mo ago · 0 comments

>Go look at the referenced paper[0]. It is on page 3, last item in Figure 1, labeled "Simple Python code given spec and examples". That line is just after 2023 and goes to just after 2028. There's a dot representing the median opinion that's left of the vertical line half way between 2023 and 2028. Last I checked, 8-3 = 5, and 2025 < 2027.

The graph you're looking at is of the 2023 survey, not the 2022 one

As for your question, I don't see what it proves. You described the desired conditions for a sorting algorithm, and ChatGPT implemented a sorting algorithm. In the case of an array with one element, it bypasses the for loop automatically and just returns the array. It is reasonable for it to assume all inputs are arrays, because your question told it that its requirement was to create a program that "turn any list of numbers into a foobar."

Of course, I'm not one of the researchers asked for their predictions in the survey, but if you told them "a SOTA AI in 2025 produced working, human-readable code based on a list of specifications, and is only incorrect by a broad characterization of what counts as an edge case that would trip up a reasonable human coder on the first try", I'm sure the 2022 or 2023 respondents would say that it meets their threshold.


godelski · 9mo ago

  > As for your question, I don't see what it proves.

The author made a claim

I showed the claim was false

The author bases his argument on this and similar claims. Showing that his claim is false shows that his argument doesn't hold

  > and is only incorrect by a broad characterization

I don't know if I'd really call a single item an "edge case" so much as a matter of generalization.

But I do know I'd answer that question differently given your reframing.
