The results are not specific to neural networks (similar techniques could be used to fool logistic regression). The problem is that ultimately a trained network relies heavily on certain activation pathways which can be precisely targeted (given full knowledge of the network) to fool it into misclassifying data points which, to a human, seem imperceptibly changed from ones that are correctly classified. It is important to understand adversarial cases, but unreasonable to get carried away with sweeping pronouncements about what this does or doesn't say about all neural networks, let alone intelligence generally, or the entire enterprise of AI research, as seems to happen after a splashy headline.
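A minimal sketch of the logistic regression point, assuming full knowledge of the weights. It uses a gradient-sign style perturbation (a swapped-in technique, not the evolutionary method discussed in the thread), and the model, weights and inputs are all hypothetical toy values:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def sign(v):
    return (v > 0) - (v < 0)

def adversarial_step(weights, x, true_label, epsilon):
    """Shift every feature by epsilon in the direction that increases the loss.

    For logistic regression the gradient of the log-loss with respect to the
    input is (p - y) * w, so its sign is sign(w) for y = 0 and -sign(w) for y = 1.
    """
    direction = -1 if true_label == 1 else 1
    return [xi + epsilon * direction * sign(w) for xi, w in zip(x, weights)]

# Hypothetical model: 100 features with small weights of alternating sign.
weights = [0.1 if i % 2 == 0 else -0.1 for i in range(100)]
bias = 0.0
# Hypothetical input that the model assigns to class 1.
x = [0.5 if i % 2 == 0 else 0.3 for i in range(100)]

p_clean = predict(weights, bias, x)
x_adv = adversarial_step(weights, x, true_label=1, epsilon=0.15)
p_adv = predict(weights, bias, x_adv)

print(f"clean: {p_clean:.3f}  perturbed: {p_adv:.3f}")
```

No single feature moves by more than epsilon, yet the many small shifts add up across 100 dimensions and push the prediction to the other side of 0.5.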
I understand that if you have a fixed training data set of, say, 10 million images, there are probably going to be adversarial examples that will be specific to a net trained on that set. (These may be the same artifacts that are picked up by overfitting, or they could be local underfitting artifacts due to early stopping.) But are they specific to the data set?
In other words, are the same adversarial examples going to "break" a neural net trained on a separate (but identically generated) set of 10 million images? Or is this something that can be smoothed away with, say, ensemble methods and multiple data sets?
The "polynomials" are the "nice images" and the "continuous with unbounded variation" are the "noisy ones".
Disclaimer: I am neither a lawyer nor an expert on NN.
Edit: I would like to be corrected if the similitude is wrong.
I agree that it doesn't seem to be a huge cause of concern for the general use case, but it does make me question the use of machine learning algorithms for computer security, an idea which I've seen a few companies pursuing nowadays.
Also a reason why it's good to be leery when people say, "I don't need to know what's going on in the model, the results are good enough."
In a way this is interesting because it's a sort of visualization of what the network views as important in discriminating between different objects. It's also interesting as a display of how alien the learned model's view of the world is.
Take optical illusions...optical illusions are remotely similar to this sort of exploit, although the sort of scene modeling we do is a lot more complex than recognition or decomposition. Anyways, illusions exploit cues that result in distorted recognition but not drastically so, unlike the case for these networks. My guess is that this is due to animal vision using a lot more high level cues -- cues that are also useful in a natural setting -- depending on things like size, color, shade, lines, context and so on. Visual systems are also a lot more proactive, filtering out things that don't make sense, fudging color at the edges of vision, smoothing out shades and generally making inferences and deductions about what it should be seeing and how things are "supposed" to be. In fact, a good number of illusions exploit those aspects of vision.
In the case of these networks, the cues are incomprehensible, having no natural counterpart, so we see most of them as noise. But sometimes they make a kind of sense, as in the starfish, baseball and sunglasses examples. Based on the observations in the paper, I would guess only a handful of activations strongly associated with each feature are responsible for each susceptibility.
With animal brains the distortions usually end up in a slightly transformed space, a different scaling or something. It's useful to match a bit overzealously and get something like pareidolia but it also makes sense to have the conflations actually be like something you might run into. The ANNs have no such incentive.
Their paper also wonders about whether this is unique to discriminative classifiers. Would a generative classifier, with access to a proper distribution, be so susceptible? That'd be very interesting to see.
They also mention some real world consequences, some of which I disagree with. Neural Networks are good at interpolating between examples, so if your training has good coverage over what is to be expected then it'll work very well. And in the era of big data this isn't really a problem (that they don't generalize as we do might explain some of why they have trouble with abstract images) so I'm skeptical an image search solution would be thrown off by textures.
There is, however, a better example of facial or speaker recognition. For example, you could train a network to distinguish between faces or voices and then evolve a pattern against it. This could then be used in such a way as to be randomly matched to an individual on a target database. Not good. Driverless cars are also mentioned but those are typically augmented beyond just vision. Personally, I'd add medical scans to the list of things to be careful with.
Finally, it's worth mentioning that some of the evolved images are inspired works of art. And a few of the images optimized (not evolved) with an L2 penalization are recognizable without the label and a few more where you can see why it gave the label it did.
Your offhanded dismissal was unwarranted IMO.
I'm willing to bet that in the late 70s many people also thought that rule-based systems could lead to real AI if they were "more complex" and had more computing power available.
Do you have any references to support this?
What I mean is, humans have a 6th sense that tells us when things are 'off' and require further examination. This comes from having 5 senses that are highly tuned to the world around us and work incredibly well together. In some ways, physical hardware has limitations on its ability to 'sense' the outside world, but in other ways, the ability to analyze and collect hard data is significantly more powerful than what humans can achieve, even at a subconscious level. Do current neural network algorithms not have this kind of failsafe?
In fact optical illusions are probably the best example of this; even knowing that what you perceive is not correct doesn't change your perception.
Do you have any references or examples where such behavior has been demonstrated on non-NN algorithms (like logistic regression)?
[1] http://www.reddit.com/r/MachineLearning/comments/2lmo0l/ama_...
Financial accounting can be thought of as a system that tries to make it easy to differentiate between irregularities and normal operation from a few summary documents. The summary documents, like balance sheets, are abstractions of the original records, and you can normally detect financial problems easily using them.
But if you have an intelligent adversary within the accounting process who tries to forge the result, you must look into the original records and go through them to find the forgery.
With that in mind, is it really surprising that [m]any of our attempts at emulating intelligence can be easily fooled? An untold number of species have evolved to do exactly the same thing: exploit the pattern matching errors of predators to disguise themselves as leaves or tree branches or venomous animals that the predator avoids like the plague. DNNs seem to be relatively new and we've got a long ways to go, so is this a fundamental problem with the theoretical underpinning or do we just need to train them with far more contextualized data (for lack of a better phrase)?
Is there any chance of us having accurate DNNs if we can, as if gods during the course of natural selection, peek into the brain of predators (algorithms) and reverse engineer failures (disguises for prey) like this?
How can we build a mostly automated future, if the AIs that are supposed to do our jobs turn out to be very fallible as well? They won't - supposedly - have the problem of being self-aware and being able to follow their emotions rather than their own best judgement and reasoning. But it seems that some problems are inherently prone to making mistakes. Can it be avoided at all? And if so, who do we blame when an AI makes a "mistake" like that? The training set?
* AIs can focus all their capacities on a single task for an unlimited time, while a human can focus for a couple of hours each day. (That’s 5-6X thinking time each day.)
* More importantly, AIs have faster access to knowledge systems, other AIs and computation resources (e.g. for simulations and prototyping). An AI will possibly need only on the order of 10^-3 to 10^-2 s to query and interpret information, while humans are more in the ballpark of 10^0 to 10^2 s.
* Another advantage is that AIs could fork modified versions of themselves, which possibly results in an exponential evolution. The rate of evolution could possibly be about 10^8 times higher than for humans (several seconds vs. 25 years).
Stating that AIs can be foiled by a specifically crafted adversarial attack does not mean the AI is very fallible. Under normal circumstances it still outperforms humans. But let's say the error rate of a net is on average 3%, and an attack works only if it is crafted for a specific net (where the weights are known etc.). Just as a team of humans is able to come up with better solutions than individual team members, an ensemble of nets (usually) outperforms any of its individual members. For instance, the final model generalizes better, because it uses information and predictions from all the nets. Foil one out of ten nets, and the other nine will cancel out the bad prediction with their votes.
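The voting argument can be sketched in a few lines. This assumes, as the comment does, that the attack is crafted for one specific net and does not carry over to the others; the labels and the ten "nets" are hypothetical stand-ins:

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the most ensemble members."""
    return Counter(predictions).most_common(1)[0][0]

# Ten hypothetical nets. The attack was crafted against net 0 only,
# so only its output flips; the other nine are assumed unaffected.
clean_votes = ["panda"] * 10
attacked_votes = ["gibbon"] + ["panda"] * 9

ensemble_label = majority_vote(attacked_votes)
print(ensemble_label)
```

If the same perturbation also fools several of the other nets, the vote no longer helps, so the whole argument rests on that assumption.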
In an automated future jobs will be taken up by AI. A security guard then becomes a supervisor: 100s of nets will try to detect disturbances in a mall, and when they find one, they notify the supervisor. A judgement call is then made by a human. This is currently happening with law expert systems. A judge will input the case and get a prediction for punishment. Then make a final adjustment to this punishment, looking at the context of the case. Such a system prevents racially based sentences (the AI will not care about race, but look at precedent), while still giving emotions, judgment and reasoning a final say.
> it seems that some problems are inherently prone to making mistakes. Can it be avoided at all?
Some problems are incomputable, like finding the shortest program to reproduce a larger string. In others, like lossy compression, mistakes may be made in representing lossless information, that humans are unable to spot. I think this is a very interesting, but difficult question to answer.
> who do we blame when an AI makes a "mistake" like that? The training set?
We blame the AI researcher that built the net :). And she will blame the training set :).
If you train 1000 AIs on different subsets of your training corpus, their ensemble will be much "hardier" than one AI trained on the entire corpus. The automated future comes from the fact that you didn't need 1000 full training corpora to get this effect, nor do 1000 AIs cost much more than one to run, once you've built out hardware enough for one.
In other words, AI makes the application of "brute-force intelligence" to a large problem cheap enough to be feasible, in the same way slave labor made building pyramids by brute force cheap enough to be feasible.
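A toy sketch of training many models on different subsets of one corpus and letting them vote. The 1-D nearest-centroid "model", the corpus and all the sizes here are hypothetical stand-ins for real nets and real data:

```python
import random
from collections import Counter

def train_centroid(samples):
    """Fit a 1-D nearest-centroid classifier: one mean value per label."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def classify(centroids, value):
    """Pick the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def train_ensemble(corpus, n_models, subset_size, seed=0):
    """Train each model on its own random subset of the single shared corpus."""
    rng = random.Random(seed)
    return [train_centroid(rng.sample(corpus, subset_size)) for _ in range(n_models)]

def ensemble_classify(models, value):
    """Majority vote over the individual models' predictions."""
    votes = Counter(classify(m, value) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical corpus: label "a" clusters near 0-5, label "b" near 10-15.
corpus = [(i * 0.1, "a") for i in range(50)] + [(10 + i * 0.1, "b") for i in range(50)]
models = train_ensemble(corpus, n_models=25, subset_size=30)

print(ensemble_classify(models, 1.0), ensemble_classify(models, 12.0))
```

Only one corpus is needed: every model sees a different slice of it, which is the point being made about not needing 1000 full training sets.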
The textbook solution for your problem is to throw multiple methods at a problem, and then when they disagree let a human use their judgement.
This is the best of both worlds: computers do the easy, repetitive stuff that humans find boring. Then the tricky things humans use their judgement on.
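A minimal sketch of that workflow, assuming each method returns a plain label; the function name and the "auto" / "human_review" tags are made up for illustration:

```python
def decide(predictions):
    """Accept a unanimous label automatically; otherwise defer to a human."""
    if len(set(predictions)) == 1:
        return ("auto", predictions[0])
    return ("human_review", None)

# Hypothetical outputs from three independent methods on two inputs.
unanimous = decide(["cat", "cat", "cat"])
split = decide(["cat", "dog", "cat"])

print(unanimous, split)
```

Requiring unanimity is the strictest choice; a looser rule (say, escalate only when fewer than two thirds agree) would send less work to the human at the cost of letting more errors through.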
Suppose there is a visual face recognition door lock or similar system. If I want to break such a door lock, can I install that system at home, train it with secretly taken pictures of an authorized person, and evolve some kind of key picture with my home system until I can show it to the target door lock and fool it into giving me access?
OK this is a very simplified way to put the question, but is that something this paper would imply to be possible (in a more sophisticated way)?
As someone who actively researches biometrics, I can't say that this is a good method, for a few reasons.
1) Systems often train templates which look very different than the original input, especially if more than one image is involved in training. These templates aren't necessarily going to be recognizable to the first system (even if they can be represented as a 2D image).
2) Many enterprise systems (such as from Honeywell or whoever) include liveness tests and spoofing measures. Though anecdotally they are not very good, they check for basic measures such as if the pupil expands and contracts from a burst of light.
3) Most biometrics that involve access to some place (verification) usually include a 3rd party monitoring said access.
If you were to do this for, say, someone's home, then depending on the system you may gain access with a high-definition photo, as many consumer systems are set to a higher false accept rate (FAR) to prevent user aggravation. However, if they set it to be very strict (giving a larger false reject rate), then the best way would probably be to attack the sensor directly. That is, the system often doesn't care about the surroundings; it's trained for one task (open if authorized user).
The speaker runs you through a hypothetical case study, a pet-door company, and looks at the pitfalls of applying machine learning to it.
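The FAR versus false reject trade-off comes down to where the match threshold sits. A small sketch, with made-up match scores and thresholds (no real system's numbers):

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when scores at or above the threshold are accepted."""
    false_accepts = sum(s >= threshold for s in impostor_scores)
    false_rejects = sum(s < threshold for s in genuine_scores)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))

# Hypothetical match scores in [0, 1]; higher means "more like the enrolled user".
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]
impostor = [0.20, 0.35, 0.58, 0.72, 0.41]

lenient = error_rates(genuine, impostor, threshold=0.5)  # consumer-style setting
strict = error_rates(genuine, impostor, threshold=0.8)   # strict setting

print("lenient (FAR, FRR):", lenient)
print("strict  (FAR, FRR):", strict)
```

With these toy scores the lenient threshold never locks out the real user but lets two of five impostor attempts through, and the strict threshold does the reverse.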
I believe the paper is actually focusing on something else: Create images that humans will not be able to classify as a digit, but that the net will gladly give a prediction for. To translate to faces: There may be clouds that look random to our human brain, but are detected as faces by nets.
It seems this is an adversarial attack, where you need access to the guts of the net (weights, layers). I compare this with a hashing algorithm and brute-forcing the input till you find a collision with a target. Nearly impossible in real-life situations.
You may be able to sign a check using a scribble that the cashier cannot recognize as a digit, but that the machine will recognize as a digit. Not much practical gain from an attack there.
But in general, this class of problems is why a biometric lock is only useful if accompanied by a guard with a gun stationed next to it.
However, you will need the database or training data fed into that lock system. Without the data, you won't be able to test it at home.
...But, how different is this from the various optical illusions humans fall for? I mean we can't exactly tell the difference between a rabbit and duck ourselves[1] so isn't it just a universal property of all neural-network like systems that there will be huge areas of mis-classifications for which there hasn't been specific selection?
The human brain recognizes better because it can sample the image many times from many slightly different angles. There's a reason saccades (http://en.wikipedia.org/wiki/Saccade) exist.
There are two parts of this process that are kind of flaky. The problem above is one of them. The other part is feature extraction where the feature set is learned from the training set. The features thus selected are chosen somewhat randomly and are very dependent on the training set. It's amazing to me that it works at all. Earlier thinking was to have some canonical set of features (vertical lines, horizontal lines, various kinds of curves, etc.), the idea being to mimic early vision, the processing that happens in the retina. Automatic feature choice apparently outperforms that, but may not really be working as well as previously believed.
It's great seeing all this progress being made.
I recently attended a talk by Geoff Hinton on "capsules." He pointed out that the max pooling used in convolutional neural networks effectively disregards information about relationships among features. Instead, he proposes a network composed of "capsules" that each estimate whether an implicitly defined intermediate feature is present and its pose. The idea is that an object is present only if its intermediate features are present and their poses agree. He showed some neat results from these models (some published in http://arxiv.org/pdf/1412.1897v1.pdf, and some from http://www.cs.utoronto.ca/~tijmen/tijmen_thesis.pdf). Notably, these models can evidently learn to classify MNIST with >98% accuracy given only 25 labeled examples. (I am not sure how many unlabeled examples were used.) I don't have any experience with these models, but given that most of these images look like a single feature embedded in noise or as a texture, I would not be surprised if a capsule-based network were not so susceptible to these images.
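The max pooling point is easy to see on a toy example. This sketch uses plain 2x2 non-overlapping pooling on two hypothetical activation maps; it illustrates only the information loss, not capsules themselves:

```python
def max_pool_2x2(feature_map):
    """Apply non-overlapping 2x2 max pooling to a 2-D list with even side lengths."""
    pooled = []
    for r in range(0, len(feature_map), 2):
        row = []
        for c in range(0, len(feature_map[0]), 2):
            row.append(max(feature_map[r][c], feature_map[r][c + 1],
                           feature_map[r + 1][c], feature_map[r + 1][c + 1]))
        pooled.append(row)
    return pooled

# Two hypothetical 4x4 activation maps: the same strong responses,
# but sitting at different positions inside each pooling window.
map_a = [[9, 0, 0, 0],
         [0, 0, 0, 7],
         [0, 0, 0, 0],
         [5, 0, 0, 3]]
map_b = [[0, 0, 0, 7],
         [0, 9, 0, 0],
         [0, 5, 3, 0],
         [0, 0, 0, 0]]

print(max_pool_2x2(map_a))
print(max_pool_2x2(map_b))
```

Both maps pool to the same output, so everything downstream of the pooling layer cannot tell how the features were arranged relative to each other within each window.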
It would be interesting to introduce some kind of aspects known from the human brain and see if the misclassified items "move" in some conceptually understandable direction.
* Introduce time. Humans are not just image classifiers; humans are able to recognize objects in visual streams of images. Such streams can be seen as latent variables that introduce correlations over time as well as space. What constitutes spatial noise might very well be influenced in our brains by the temporal correlations we see as well.
* Introduce saccades. A computer is only able to see a picture from one viewpoint. Our eyes undergo saccades and microsaccades. That's an unfair advantage for us, being able to see a picture multiple times from different directions!
* Introduce the body. We can move around an object. This again introduces correlations that 1.) are available to us, and 2.) might define priors even when we are not able to move around the picture. In other words, we can (unconsciously) rotate things in our head.
http://arxiv.org/abs/1312.6199.
And the article where I found the reference to it:
http://www.i-programmer.info/news/105-artificial-intelligenc...
Then again, unlike the neural networks in the paper, humans would be capable of classifying abstract images into a separate category if asked.
Although I doubt the visual cortex is a simple feed forward network like the one used in the paper. It's likely to have a non linear structure that's significantly more complex.