Nvidia's demo of real-time object recognition using deep learning [video]

(youtube.com)

75 points · vkhuc · 11y ago · 23 comments


polskibus · 11y ago

Looks great! Can anyone with more knowledge about deep learning say whether this is an exceptional achievement in the field?

gamegoblin · 11y ago

As far as state-of-the-art classification goes, this is not terribly impressive. See the recent results in something like:

http://cs.stanford.edu/people/karpathy/deepimagesent/

I think the impressive thing here is that the GPU is presumably doing GIANT matrix multiplications in real time. A prediction from a neural net is essentially a series of matrix multiplications, and matrix multiplication is about n^2.8 in complexity with Strassen's algorithm (n^3 for the schoolbook method), so you can see how multiplying matrices with thousands of rows/columns (often what these sorts of deep image classifiers involve) is hugely computationally expensive.
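To put rough numbers on that, here is a minimal back-of-the-envelope sketch in Python. The layer widths (4096, 4096, 4096, 1000) and the 30 frames per second rate are assumptions I made up for illustration, not figures from the demo, and it only counts the dense-layer multiplications (a real image classifier also has convolutions and nonlinearities in between).

```python
def matmul_flops(m, k, n):
    """FLOPs for a schoolbook (m x k) @ (k x n) product: one multiply and one add per term."""
    return 2 * m * k * n


def forward_pass_flops(layer_widths, batch=1):
    """Sum the matmul cost over consecutive dense layers for `batch` inputs."""
    pairs = zip(layer_widths, layer_widths[1:])
    return sum(matmul_flops(batch, n_in, n_out) for n_in, n_out in pairs)


# Hypothetical fully connected stack; the widths are illustrative only.
LAYER_WIDTHS = [4096, 4096, 4096, 1000]
FRAMES_PER_SECOND = 30  # assumed "real time" rate, not taken from the video

per_frame = forward_pass_flops(LAYER_WIDTHS)
per_second = per_frame * FRAMES_PER_SECOND
print(f"{per_frame / 1e6:.1f} MFLOPs per frame, {per_second / 1e9:.2f} GFLOPs per second")

# Square n x n multiply: schoolbook n^3 versus the ~n^2.8 Strassen-style exponent.
n = 4096
print(f"n^3 / n^2.8 at n={n}: {n ** 3 / n ** 2.8:.1f}x")
```

Running the classifier over many candidate regions per frame multiplies the per-frame cost by the `batch` argument, which is where the numbers start to get large.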

So it's definitely important for real time machine learning systems to have access to this kind of linear algebra power, but the actual machine learning techniques demonstrated are not super impressive. The hardware is. Which makes sense since this is an Nvidia demo.

dogma1138 · 11y ago

From what I can understand, what's even more impressive is that it was running on a beefed-up version of their latest mobile SoC and not on some $5000 compute GPU card. That means this application can be both very affordable and very practical, since people won't put a 300W GPU in their car.

davesque · 11y ago

I'm not an expert but, from what I do know, it seems like the take-away here is that it's running on tech which is within reach of most consumers. Sure, academics are accomplishing more impressive feats in the lab. However, it looks like nVidia has brought these algorithms onto hardware which is probably not much different from what they're already selling to gamers at a reasonable price. That could end up being a really big deal and could boost applications of computer vision in consumer tech significantly.

nl · 11y ago

Current state of the art is a bit better than this. See the bottom section of [1] for some of the latest publications.

However, building a real world working system has challenges that are different to the academic challenge of trying to classify the most classes possible in static images.

[1] http://blogs.technet.com/b/machinelearning/archive/2014/11/1...

exDM69 · 11y ago

A few commenters have already said that this isn't really groundbreaking work, but how about for real time? And on commodity hardware (this one is a few-watt mobile chip)?

locusm · 11y ago

The CEO butting in all the time was really annoying. I had a business partner who did this in meetings all the time; it's fucking annoying and rude. On the surface I have no idea if this is groundbreaking or not. My first thought was: ahh, nVidia using Linux!

razster · 11y ago

Unfortunately these are the ones that make it to the top and are named CEO.

JabavuAdams · 11y ago

Yeah, and his colour commentary is subtly wrong, too.

ramy_d · 11y ago

Maybe because there was no rehearsal.

zwieback · 11y ago

Cool demo, but I still wonder if fundamentally this is just a brute-force approach. Wouldn't it be better to do some traditional preprocessing (e.g. recognizing rectangles, circles, etc.) and feed higher-level descriptors into the classifier?

If the net learns based on pixels you still have to somehow solve rotation and scale invariance. Or is there something new in deep-learning vs. old-school neural nets that fixes the issues that bedeviled neural nets the first time they were popular?

vkhuc (OP) · 11y ago

I think they used the methods described in http://www.cs.berkeley.edu/~rbg/papers/r-cnn-cvpr.pdf

zwieback · 11y ago

Thanks, interesting paper.

rasz_pl · 11y ago

@10:08

On the right, a Merc SLS is classified as an SUV.

On the left, one SUV is classified as two VANs.

Their algorithm runs at about 1 Hz when doing signs. This is roughly state of the art from 20 years ago, but running on a small mobile SoC at a slow rate.

SammoJ · 11y ago

Please show a paper where fine-grained vehicle classification in unconstrained images is anywhere near this performance from 20 years ago. You will not be able to, because it wasn't.

rasz_pl · 11y ago

State-of-the-art classification accuracy/range, not speed.

nl · 11y ago

Here's how to do the street sign part of this yourself: https://gist.github.com/iandees/f773749c47d088705199

plg · 11y ago

The video shows the demo happening in Ubuntu, or at least the video playback.
