YeGoblynQueenne · 1y ago · 0 points

"Designed" is not right. What gives "AI models" (i.e. deep neural nets) a hard time is that there are very few examples in the public training and evaluation set: each task has three examples. So basically it's not a test of intelligence but a test of sample efficiency.

Besides which, it is unfair because it excludes an entire category of systems, not to mention a dominant one. If F. Chollet really believes ARC is a test of intelligence, then why not provide enough examples for deep nets or some other big data approach to be trained effectively? The answer is: because a big data approach would then easily beat the test. But if the test can be beaten without intelligence, just with data, then it's not a test of intelligence.

My guess for a long time has been that ARC will fall just like the Winograd Schema challenge (WSC) [1] fell: someone will do the work to generate enough (tens of thousands) examples of ARC-like tasks, then train a deep neural net and go to town. That's what happened with the WSC. A large dataset of Winograd schema sentences was crowd-sourced and a big BERT-era Transformer got around 90% accuracy on the WSC [2]. Bye bye WSC, and any wishful thinking about Winograd schemas requiring human intuition and other undefined stuff.
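To make the "generate tens of thousands of examples" step concrete, here is a minimal sketch of what such a generator could look like. Everything in it is an assumption made for illustration: the three toy transformations, the function names and the grid sizes are hypothetical, and real ARC tasks are far more varied than simple flips. The `train`/`test` and `input`/`output` keys are meant to mirror the layout of the public ARC task files.

```python
import random

# Hypothetical family of hidden rules; an illustration, not the ARC task space.
def flip_horizontal(grid):
    return [row[::-1] for row in grid]

def flip_vertical(grid):
    return [list(row) for row in grid[::-1]]

def transpose(grid):
    return [list(col) for col in zip(*grid)]

TRANSFORMS = {
    "flip_horizontal": flip_horizontal,
    "flip_vertical": flip_vertical,
    "transpose": transpose,
}

def random_grid(rng, height, width, n_colours=10):
    """Grid of integer colour codes in the range 0..n_colours-1."""
    return [[rng.randrange(n_colours) for _ in range(width)] for _ in range(height)]

def make_task(rng, n_train=3):
    """One task: a hidden rule, a few demonstration pairs and one test pair."""
    rule = rng.choice(sorted(TRANSFORMS))
    fn = TRANSFORMS[rule]

    def pair():
        grid = random_grid(rng, rng.randint(3, 8), rng.randint(3, 8))
        return {"input": grid, "output": fn(grid)}

    return {"rule": rule, "train": [pair() for _ in range(n_train)], "test": [pair()]}

def make_dataset(n_tasks, seed=0):
    """Generate n_tasks synthetic tasks reproducibly from a seed."""
    rng = random.Random(seed)
    return [make_task(rng) for _ in range(n_tasks)]
```

Calling `make_dataset(50_000)` would give a training set of the size discussed above. Whether a net trained on such data transfers to the real evaluation set is exactly the open question.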

Or, ARC might go the way of the Bongard Problems [3]: the original 100 problems by Bongard still stand unsolved, but the machine learning community has effectively sidestepped them. Someone made a generator of Bongard-like problems [4], and while this was not enough to solve the original problems, everyone simply switched to training CNNs and reporting results on the new dataset [5].

We basically have no idea how to create a test for intelligence that computers cannot beat by brute force or big data approaches so we have no effective way to test computers for (artificial) intelligence. The only thing we know humans can do that computers can't is identify undecidable problems (like Barber Paradoxes i.e. statements of the form "this sentence is false", as in Gödel's second incompleteness theorem). Unfortunately we already know there is no computer that can ever do that, and even if we observe, say, ChatGPT returning the right answer, we can be sure it has only memorised it, not calculated it, so we're a bit stuck. ARC won't get us unstuck in any way, shape or form, and so it's just a distraction.

_____________________

[1] https://en.wikipedia.org/wiki/Winograd_schema_challenge

[2] WinoGrande: An Adversarial Winograd Schema Challenge at Scale

https://arxiv.org/abs/1907.10641

Although note the results are interpreted to mean LLMs are more or less memorising answers, which is right of course.

[3] Index of Bongard Problems

https://www.foundalis.com/res/bps/bpidx.htm

[4] Comparing machines and humans on a visual categorization test

https://www.pnas.org/doi/abs/10.1073/pnas.1109168108

[5] 25 years of CNNs: Can we compare to human abstraction capabilities?

https://arxiv.org/abs/1607.08366

8 comments · 3 top-level

falcor84 · 1y ago

> My guess for a long time has been that ARC will fall just like the Winograd Schema challenge (WSC) fell: someone will do the work to generate enough (tens of thousands) examples of ARC-like tasks, then train a deep neural net and go to town.

I think that this would be the real AGI (or even superintelligence hurdle) - having essentially a metacognitive AI understand that something given to it is a novel problem, for which it would use the given examples to automatically generate synthetic data sets and then train itself (or a subordinate model) based on these examples to gain the skill of solving this general type of problem.

> The only thing we know humans can do that computers can't is identify undecidable problems (like Barber Paradoxes i.e. statements of the form "this sentence is false", as in Gödel's second incompleteness theorem). Unfortunately we already know there is no computer that can ever do that

Where did the "ever" come from? Why wouldn't future computers be able to do this at (least at) a human level?

YeGoblynQueenne (OP) · 1y ago

The "ever" comes from the Church-Turing thesis. Maybe in the future computers will not be Turing machines, but that we can't know yet.

falcor84 · 1y ago

That refers to the unsolvability of the general case, which is of course also unsolvable by humans.

famouswaffles · 1y ago

>We basically have no idea how to create a test for intelligence that computers cannot beat by brute force or big data approaches.

Agreed

>so we have no effective way to test computers for (artificial) intelligence.

I never quite understand stances like this, considering that evolved human intelligence is exactly the consequence of incredible brute force and scale. Why does the introduction of brute force suddenly mean we cannot 'truly' test for intelligence in machines?

YeGoblynQueenne (OP) · 1y ago

This is my entire quote:

>> We basically have no idea how to create a test for intelligence that computers cannot beat by brute force or big data approaches so we have no effective way to test computers for (artificial) intelligence.

When I say "brute force" I mean an exhaustive search of some large search space, in real time, not in evolutionary time. For example, searching a very large database for an answer, rather than computing the answer. But, as usual, I don't understand the point you're trying to make and where the bit about evolution came from. Can you clarify?

Btw, three requests, so we can have a productive conversation with as little time wasted in misunderstandings as possible:

a) Don't Fisk me (https://www.urbandictionary.com/define.php?term=Fisking).

b) Don't quote my words out of context.

c) If you don't understand why I say something, just ask.

famouswaffles · 1y ago

Ok. I guess I misunderstood you then. I didn't mean to quote you out of context.

I just meant the human brain is the result of brute force. Evolution is a dumb biological optimizer whose objective function is to procreate. It's not exactly search, but then neither is the brute force of modern NNs.

visarga · 1y ago

> "Designed" is not right. What gives "AI models" (i.e. deep neural nets) a hard time is that there are very few examples in the public training and evaluation set

No, he actually made a list of cognitive skills humans have and is targeting them in the benchmark. The list of "Core Knowledge Priors" contains Object cohesion, Object persistence, Object influence via contact, Goal-directedness, Numbers and counting, Basic geometry and topology. The dataset is fit for human ease of solving, but targets areas hard for AI.

> "A typical human can solve most of the ARC evaluation set without any practice or verbal explanations. Crucially, to the best of our knowledge, ARC does not appear to be approachable by any existing machine learning technique (including Deep Learning), due to its focus on broad generalization and few-shot learning, as well as the fact that the evaluation set only features tasks that do not appear in the training set."

YeGoblynQueenne (OP) · 1y ago

Thanks, I know about the core knowledge priors, and François Chollet's claims about them (I've read his white paper, although it was long, and long-winded and I don't remember most of it). The empirical observation however is that none of the systems that have positive performance on ARC, on Kaggle or the new leaderboard, have anything to do with core knowledge priors. Which means core knowledge priors are not needed to solve any of the so-far solved ARC tasks.

I think Chollet is making a syllogistic error:

  a) Humans have core knowledge priors and can solve ARC tasks
  b) Some machine X can solve ARC tasks
  c) Therefore machine X has core knowledge priors

That doesn't follow and, like I say, it is also refuted by empirical observation. This is particularly so for his claim that ARC "does not appear to be approachable" by deep learning: there are plenty of neural-net-based systems on the ARC-AGI leaderboard.
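For anyone who wants the invalidity spelled out, here is a small Lean 4 sketch of a countermodel. The type `Agent` and the predicates `hasPriors` and `solvesARC` are hypothetical names introduced only for this illustration. The sketch shows that the inference pattern is not valid in general; it makes no claim about any actual system.

```lean
-- Hypothetical two-element world: one human and one machine.
inductive Agent where
  | human
  | machineX

-- In this model only the human has core knowledge priors...
def hasPriors : Agent → Prop
  | .human => True
  | .machineX => False

-- ...but every agent solves ARC tasks.
def solvesARC : Agent → Prop
  | _ => True

-- Premises (a) and (b) hold, yet conclusion (c) is false.
example :
    (hasPriors .human ∧ solvesARC .human) ∧
    solvesARC .machineX ∧
    ¬ hasPriors .machineX :=
  ⟨⟨trivial, trivial⟩, trivial, fun h => h⟩
```

Since one model satisfies both premises while falsifying the conclusion, (c) cannot be derived from (a) and (b) alone.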

There's also no reason to assume that core knowledge priors present any particular difficulty to computers (i.e. that they're "hard for AI"). The problem seems to be more with the ability of humans to formalise them precisely enough to be programmed into a computer. That's not a computer problem, it's a human problem. But that's common in AI. For example, we don't know how to hand-code an image classifier, but we can train very accurate ones with deep neural nets. That doesn't mean computers aren't good at image classification: they are, and CNNs are the proof. It's humans who suck at coding it. Except nobody's insisting on image classification datasets with only three or four training examples for each class, so it was possible to develop those powerful deep neural net classifiers. Chollet's choice to only allow very few training examples creates an artificial data bottleneck that does not restrict anyone in the real world, so it tells us nothing about the true capabilities of deep neural nets.

Cthulhu. I never imagined I'd end up defending deep neural nets...

I have to say this: Chollet annoys me mightily. Every time I hear him speak, he makes gigantic statements about what intelligence is, and how to create it artificially, as if he knows what tens of thousands of researchers in biology, cognitive science, psychology, neuroscience, AI, and who knows what other field, don't. That is despite the fact that he has created just as many intelligent machines as everyone else so far, which is to say: zero. Where that self-confidence comes from, I have no idea, but the results on his "AIQ test" indicate he, just like everyone else, has no clue what intelligence is, yet he persists with the absurd self-assurance. Insufferable arrogance.

Apologies for the rant.
