Why?
> I suspect most people could pretty quickly come up with something
It only takes 60 seconds to test that on yourself. It's not that easy to come up with something of similar length to ChatGPT's answer that also sounds somewhat natural/sensible.
For the same reason that "I don't know" is generally a better response than bullshitting.
>It's not that easy to come up with something of similar length to ChatGPT's answer that also sounds somewhat natural/sensible
Those weren't requirements.
Then it seems we don't disagree on anything concrete. You're just using a different rating system than I am: I judge it impressive compared to what an average person would produce in 60 seconds.
Not sure if this is a general principle of yours. If ChatGPT were able to write a 1000-word essay using all 5-letter words except for a single mistake, would you still find it unimpressive? Do you think a tool or person who makes minor mistakes isn't useful? Or only when the tool/person makes major mistakes?
The logic failure in the above statement is probably worse than the logic failure of not being able to spontaneously compose a phrase using only 5-letter words while slipping in one or two with a higher letter count.
>I suspect most people could pretty quickly come up with something
You'd be very surprised then. Most people fail at even more basic tasks.
Heck, most candidate programmers fail at fizz-buzz (which isn't much more difficult than the above).
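(For reference, the entire fizz-buzz task is roughly this; a quick Python sketch of the classic version:

    # Print 1..100, substituting Fizz/Buzz/FizzBuzz for multiples of 3/5/15.
    for i in range(1, 101):
        if i % 15 == 0:
            print("FizzBuzz")
        elif i % 3 == 0:
            print("Fizz")
        elif i % 5 == 0:
            print("Buzz")
        else:
            print(i)

That's the whole bar being failed.)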
And which alleged logic failure is that?
Especially in the context of "evaluating the performance of something".
Let's expand this a little to make it even more evident: if the task were "write a paragraph of 100 words using only 5-letter words" and one AI couldn't produce anything at all, whereas another came up with a 100-word paragraph in which a couple of words had 6 or 4 letters, it would make absolutely no sense to rate the first as "better" than the second at performing the task.
As for understanding the task, the latter exhibits an understanding of it (since it produced a paragraph, and most of the words it used met the criterion, which wouldn't happen if it chose them randomly); it just made a couple of mistakes, the kind humans could easily make too in such a task. For the former, we can't even be sure it understood the task at all.
We don't rate humans that way on tasks either (as if getting it less than perfect were worse than not doing it at all). Even math tests at the university level give credit for the approach and any partial results in the right direction; they don't just mark an answer 0 because it contains an error, nor give a higher mark to students who produced nothing.
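To make the partial-credit point concrete, here's a toy scorer for the 5-letter-word task (just a sketch; the fraction-of-valid-words rule is something I made up for illustration, not any real benchmark metric):

    import re

    def score(paragraph, target_len=5):
        # Fraction of words that meet the length constraint.
        words = re.findall(r"[A-Za-z]+", paragraph)
        if not words:
            return 0.0  # no output at all: zero evidence of understanding
        return sum(len(w) == target_len for w in words) / len(words)

    print(score(""))                              # 0.0 (the AI that produced nothing)
    print(score("Three small mice crept about"))  # 0.8 (mostly correct attempt)

Under any scoring along these lines, the mostly-correct attempt beats the empty one.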
There are many contexts in which correctness is important. In such contexts, an incorrect answer is often worse than an explicit non-answer.
>We don't rate humans that way on performing tasks either (if they got it less than perfect it's worse than not doing it at all). Even math tests at the university
Standardized tests often rate incorrect answers worse than non-answers, though yes, a university maths test in particular isn't likely to be that sort of test.