That's a bold claim. As far as I know there was one paper that reported a model beating human scores on a specific test (ImageNet, I believe). Whether that translates to "superhuman" results in general is followed by a very big question mark.
In general I really struggle to see how any algorithm that learns from examples, especially one that minimises a measure of error against further examples, can ever have better performance than the entities that actually compiled those examples in the first place (in other words, humans).
I'm saying: how is it possible to learn superhuman performance in anything from examples of mere human performance at the same task? I don't believe in magic.
Second, the examples were produced by scraping Flickr. Mechanical Turk workers were then asked to confirm whether the object was in the image or not.
There are many images that are kind of ambiguous, or contain multiple objects, so humans don't do perfectly. One researcher tried to estimate human performance, and got an error rate of about 5%. Which has been beaten by computers now, by a lot.
I'm not contesting the fact that it's surprising and overall a sign of progress. I'm contesting the claim that it demonstrates "superhuman" performance.
By analogy, a good student at a bad school is "superhuman" because he or she got a good mark in an exam that most other pupils _in that school_ failed. You gotta go a lot further than that before you put on the red cape.
Computers could be better at assigning probabilities to ambiguous examples. In particular, for an image that is very ambiguous for most humans, maybe a computer would assign 99% probability to it (hence it would be only a little bit ambiguous).
Besides, I have no idea whether the people who tagged ImageNet are the "average human", nor whether an ensemble of them can outperform the "average human".
Also, I'm pretty sure that it doesn't necessarily follow that an algorithm trained by many X can outperform any X. Most humans are trained by an ensemble of humans and they don't necessarily outperform the "average human".
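For what it's worth, the ensemble question can be put on the back of an envelope. The sketch below is not from any of the papers discussed; it assumes a yes/no labeling question, an odd number of labelers, and that each labeler is right with the same probability `p` independently of the others (all three are strong assumptions, and the function name is made up). Under those assumptions a majority vote beats any single labeler when `p` is above 0.5 and does worse when `p` is below 0.5, so "trained by many X" cuts both ways.

```python
from math import comb


def majority_vote_accuracy(p: float, n: int) -> float:
    """Probability that a strict majority of n labelers gives the right answer.

    Assumptions (illustrative only): binary question, odd n so there are no
    ties, and every labeler is correct with probability p, independently.
    """
    if n % 2 == 0:
        raise ValueError("use an odd number of labelers to avoid ties")
    needed = n // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(needed, n + 1)
    )
```

Calling `majority_vote_accuracy(0.7, 5)` versus `majority_vote_accuracy(0.4, 5)` shows the two regimes. Real labelers make correlated mistakes on the same ambiguous images, so the independence assumption is exactly where this toy model can fail.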
Mind you, I'm not saying I _know_ what "superhuman" is, but then again I'm not the one who claims to have created an example of it.
Also, here's an example where humans beat machines in image recognition:
http://www.pnas.org/content/113/10/2744.full
The task is the recognition of very small and blurry images. Several different models were used, including a very deep convnet.
From the article: "Leaf is lean and tries to introduce minimal technical debt to your stack."
What exactly does that mean?
Technical debt typically arises because the code was poorly structured or the programmer used the wrong tools/libraries (from a longer-term perspective) or didn't abstract when she should have. The current obsession with MVPs has led to an increase in technical debt.
I've seen it firsthand. Basically, it's the accumulation of suboptimal code, over time, usually due to time constraints imposed by management. In short, any time you do a dirty hack just to get something working and meet a deadline, and then don't find the time to refactor that code into a working non-hack, you have piled a bunch of manure onto the technical-debt heap. But it also seems to be a side-effect of normal code accretion to a codebase while on a team. In other words, there seems to be no way to avoid it entirely. It's like cancer, in biology. ;)
TD-ridden code is often not modular, not unit-tested, has many dependencies (spaghetti) which are then difficult to remove or replace and tend to trigger cascading bugs/failures, has too many responsibilities, has very long methods/functions, uses mutable state (changes global state which can then impact other parts of the codebase or make concurrency impossible), or is otherwise difficult to maintain.
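To make the mutable-state point concrete, here is a tiny made-up example (the names `_discount`, `price_with_global` and `price_explicit` are hypothetical, not from any real codebase). Both functions do the same arithmetic; the difference is whether the inputs are visible to the caller.

```python
# Hypothetical illustration of hidden global state versus explicit inputs.

_discount = 0.0  # module-level mutable state, shared by every caller


def price_with_global(amount: float) -> float:
    """Result depends on whatever some other code last wrote to _discount."""
    return amount * (1 - _discount)


def price_explicit(amount: float, discount: float) -> float:
    """Same arithmetic, but every input is visible in the signature."""
    return amount * (1 - discount)
```

The second version can be unit-tested and called from several threads without worrying about who touched `_discount` last, which is roughly what "modular" buys you.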
An example of "working" tech debt is the "God class" in codebases, the model that the entire business depends on but which is over-laden with responsibilities. The risk to change it is too great (due to the business dependence) so it becomes a constant thorn in the side of maintaining the code.
The "debt" part comes from the fact that at some point you are expected to "repay" it (via costly man-hours of refactoring work). The benefit of doing so is potentially multifold, though: Faster/more modular/better-written code, faster tests (and therefore better productivity), better designs in general, more resilient code, more maintainable code, less buggy code, etc. etc.
The only known resolutions of tech debt are costly refactorings or global rewrites. The way to reduce the risk there is to first unit-test the existing code. These books help:
http://smile.amazon.com/Growing-Object-Oriented-Software-Gui...
http://smile.amazon.com/Refactoring-Improving-Design-Existin...
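As a rough sketch of what "first unit-test the existing code" can look like, the test below records what a piece of legacy code does today, before anyone refactors it. `legacy_shipping_cost` is a made-up stand-in for untested code, and the numbers are simply whatever that stand-in currently returns, not a business rule from anywhere.

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    # Hypothetical stand-in for untested legacy code; do not "fix" it yet.
    cost = 5 + weight_kg * 2
    if express:
        cost = cost * 1.5
    if weight_kg > 20:
        cost = cost + 10
    return cost


class ShippingCharacterization(unittest.TestCase):
    """Pins down what the code does today, not what it should do."""

    def test_standard_parcel(self):
        self.assertEqual(legacy_shipping_cost(2, False), 9)

    def test_express_parcel(self):
        self.assertEqual(legacy_shipping_cost(2, True), 13.5)

    def test_heavy_express_parcel(self):
        self.assertEqual(legacy_shipping_cost(30, True), 107.5)
```

Saved in a file, this runs with `python -m unittest`. Once the tests pass against the old code, a refactoring that changes any of these values fails loudly instead of silently.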
In my opinion this is because of what people think software is.
So if you see software as code which expresses what you want, the question is what you do when it does not do what you want, or when you want something additional.
So software really is our desire for some specific thing. But it is also a tool which can express arbitrary things. So it's a mirror which reflects back on us to discover our real intentions and desires.
Eventually it's more of a conversation in which you expand and direct your intentions. And programming or software is just one way to do that.
I think eventually AI will be able to deliver such reflecting conversations to us, the question would be which medium (hardware, operating system, programming language) will it use.
I do not think it will use building blocks (hardware, operating system, programming language) created by humans, because those are too incomplete and arbitrary.
Remember, the building blocks leave plenty of room for bootstrapping on multiple levels. An AI could create blocks to create a solution that is so very simple we can't even imagine it, yet is unthinkable for humans right now.
1. The "same background" doesn't really exist for most cameras in most settings. Changes in illumination alone will make segmenting the background tricky. Moving objects in the scene will also be hard - think fountains and trees in the wind. Google for "foreground-background segmentation" to see some papers on this.
2. I haven't seen anyone use recent ML algorithms with less than high quality images. That may not matter, but it could matter a lot.
3. Extending recent ML algorithms to work with video at a high enough frame rate to be useful (10Hz at a minimum) may or may not be easy.
I'm sure that what you're proposing could be done. But I think that the number of small annoyances you'd hit would probably discourage most people who aren't treating the problem as a research exercise in Computer Vision.
If you would like to play with some of this stuff, take a look at OpenCV. http://opencv.org
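To see why point 1 bites, here is a deliberately naive background model in plain Python: a running average of past frames plus a brightness threshold. It is only a sketch; the function names, the `alpha` blend factor and the `threshold` value are arbitrary assumptions, frames are grayscale images stored as lists of rows, and OpenCV ships much more robust background subtractors than this.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background estimate (exponential moving average)."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose brightness differs from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]
```

If a cloud passes and every pixel brightens by more than `threshold`, the whole frame is flagged as foreground until the running average catches up. A tree in the wind does the same thing locally on every frame.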
Convolutional neural nets are the state of the art for this, specifically deep residual learning (http://arxiv.org/abs/1512.03385). It requires a good deal of background to understand what's going on and tune/implement the models, though, even if you just use the frameworks already out there. You probably don't even need that much data - you can probably grab pre-trained models and train them on a small additional dataset you collect.
They can definitely handle arbitrary backgrounds, although having a standard background makes the problem even easier, again.
Most deep learning computer vision algos are trained on 256x256 images, so having even larger images is just fine (you can downsample, or maybe even add up the predictions of different crops).
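The "different crops" idea can be sketched roughly like this. It is framework-free and makes several assumptions: the image is a list of pixel rows, `model` is any callable that returns a list of class probabilities for a crop (a placeholder, not a real framework API), and the predictions are averaged rather than summed, which gives the same ranking of classes. All function names are made up for illustration.

```python
def five_crop_offsets(height, width, crop):
    """Top-left corners of the four corner crops plus the center crop."""
    if crop > height or crop > width:
        raise ValueError("crop must fit inside the image")
    bottom, right = height - crop, width - crop
    return [(0, 0), (0, right), (bottom, 0), (bottom, right),
            (bottom // 2, right // 2)]


def crop_image(image, top, left, crop):
    """Cut a crop x crop window out of an image stored as a list of rows."""
    return [row[left:left + crop] for row in image[top:top + crop]]


def average_predictions(predictions):
    """Element-wise mean of the per-crop class-probability lists."""
    n = len(predictions)
    return [sum(scores) / n for scores in zip(*predictions)]


def predict_with_crops(image, crop, model):
    """Run model on each of the five crops and average the results."""
    height, width = len(image), len(image[0])
    crops = [crop_image(image, top, left, crop)
             for top, left in five_crop_offsets(height, width, crop)]
    return average_predictions([model(c) for c in crops])
```

For a 512x512 input and a 256x256 network this evaluates five windows per image, so it costs about five forward passes instead of one.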
It's good to see alternatives to Torch, Theano, and TensorFlow, but it's important to be honest with the benchmarks so that people can make informed decisions about which framework to use.
And I don't believe the first point counts as deceptive; the bars are ordered by Forward ms, not by the sum of Forward and Backward. In both CuDNN v3 and v4, Leaf is faster than Torch by that metric (25 vs 28 for v4, 31 vs 33 for v3).
Can it get much faster than something like Torch? I would think if CuDNN is doing most of the computation time it would be hard to see big improvements. Perhaps go the route of Neon and tune your GPGPU code like crazy [1, 2], or MXNet and think about distributed computing performance [3].
[0] http://autumnai.com/deep-learning-benchmarks
[1] https://github.com/soumith/convnet-benchmarks
I think that's because they're sorting by forward time rather than forward+backward. That would also explain why in the AlexNet benchmark TensorFlow (cuDNN v4) is to the left of Caffe (cuDNN v3) despite having a much taller bar overall.
You can easily add new layer types, and recurrent connections are easy too - you just add a delay node.
Furthermore, since the configuration file format is fairly simple, it is possible to make GUI tools to visualise it and - in future - edit it.
That's not to say that Leaf won't have a DSL at some point, but we will wait until the features of the layers are a bit more stabilized and we have more clearly mapped out what goals we have for a DSL.
Do you think data scientists can write their models directly using Leaf? Or do you think there will need to be a DSL that translates from the R / Python world to something you can run on Leaf to make it happen?
I can use something like pandas or autograd to experiment with new optimization functions in seconds. For these big NN models it takes hours to days to wait for your model to train so squeezing out more performance is worth a more complex language.
Without this information it's hard to make a useful comparison at all.
The numbers in the benchmark are taken from our deep-learning-benchmarks[1], which we are still in the process of building up. It might actually make sense to test the same model with different batch sizes. The current benchmarks are based on the convnet-benchmarks[2], where the AlexNet model has a batch size of 128. (AlexNet was chosen because out of the benchmarks that's the one I am most familiar with, since it is small enough that I can work with it on my laptop.)
In some informal tests Leaf was generally faster than other frameworks at smaller batch sizes, but we don't have benchmarks that we could publish with confidence yet.
[1]: https://github.com/autumnai/deep-learning-benchmarks
[2]: https://github.com/soumith/convnet-benchmarks
http://svail.github.io/rnn_perf/
It's primarily RNN-focused, but the discussion about batch sizes on GPUs is interesting.
Honestly, many modeling problems are clunky and inefficient at scale. However, that's OK. When you need to scale badly enough, you already have a significant set of libraries in Java to support this.
I'm failing to see an answer to the one question I have, "why rust?"
2. If "for hackers" is the new "for dummies" then gentrification is complete.