Wasn't there a thread here just yesterday about how 6% of some class of AI outperformed a human, but then it turned out that 0% outperformed two humans? That's also literally the lesson Uber learned the hard way when an SDV ran over a person: zero human overseers is worse than one, and one is worse than two. This is also the principle behind code review, peer review, QA, middle management bureaucracy, and a whole lot of other things.
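The "one is worse than two" intuition is just redundancy arithmetic. As a rough sketch (my numbers, and assuming reviewers fail independently, which real reviewers don't quite do):

```python
def miss_probability(m: float, n: int) -> float:
    """Probability that all n independent reviewers miss a given defect,
    when each reviewer misses it with probability m."""
    return m ** n

# Hypothetical defect that any single reviewer misses 30% of the time:
print(miss_probability(0.3, 0))  # no reviewers: defect always slips through
print(miss_probability(0.3, 1))  # one reviewer: slips through 30% of the time
print(miss_probability(0.3, 2))  # two reviewers: ~9% of the time
```

Correlated errors (shared blind spots, shared training data) erode this faster than the independent-case math suggests, which is part of the argument against a single centralized model.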
The tragedy, IMHO, is that AI models like this encourage centralizing decision-making into a single black box (to the extent that external research then benefits the owner of the model rather than advancing the public commons), whereas in pretty much every other area of life we treat decentralization and redundancy of autonomy as the solution to robustness problems.