The implications for society? We'd better up our game.
If only the horses had worked harder, we would never have gotten cars and trains.
Because the correlation between the quantity of interest and what the tests measure may be radically different for systems whose architecture is very unlike a human's than it is for humans.
There’s an entire field about this in testing for humans (psychometrics), and approximately zero work on it for AIs. Human tests are proxy measures of harder-to-directly-assess figures of merit, and they require significant calibration on humans to be valid even for humans. Blindly applying them to anything else without appropriate recalibration is good for generating headlines, but not for measuring anything that matters. (Except, I guess, the impact of humans using AI to cheat on the human tests, which is not insignificant, but not generally what people trumpeting these measures focus on.)
But the point of using these tests for AI is precisely the same reason we give them to humans -- we think we know what they measure. AI is not intended to be a computation engine or a number-crunching machine. It is intended to do things that historically required "human intelligence".
If there are better tests of human intelligence, I think that the AI community would be very interested in learning about them.
For how long can we keep upping our game? GPT-4 came less than half a year after ChatGPT. What will come in 5 years? What will come in 50?
With GPT bots, the underlying technology is only about 6 years old. I can easily see it progressing for at least another decade.
Because so far we are good only at criminalizing and incarcerating or killing them.