Neural networks are great at pattern recognition. Things like LSTMs allow pattern recognition through time, so they can develop "memories". This is useful in things like understanding text (the meaning of one word often depends on the previous few words).
But how can a neural network know "facts"?
Humans have things like books, or the ability to ask others for things they don't know. How would we build something analogous to that for neural network-powered "AIs"?
There's been a strand of research mostly coming out of Jason Weston's Memory Networks work[1]. This paper extends that line by using a new form of memory, and shows that it can perform some pretty difficult tasks, including graph tasks like London Underground traversal.
One good quote showing how well it works:
In this case, the best LSTM network we found in an extensive hyper-parameter search failed to complete the first level of its training curriculum of even the easiest task (traversal), reaching an average of only 37% accuracy after almost two million training examples; DNCs reached an average of 98.8% accuracy on the final lesson of the same curriculum after around one million training examples.
Would this be an apt metaphor: an LSTM is like a student who has to memorize how to do every problem before the test, while a DNC still learns how to take the test but is allowed to look at its notes.
From what I can tell you can give the DNC simple inputs and it can derive complex answers.
The framework in this paper trains a neural net which interacts with a memory bank much as a CPU interacts with RAM. That means it can save and recall data on request, which could lead to more flexible architectures (you can give a trained net different data to recall) and easier training (since a memory-based architecture means the neural weights no longer have to learn the data along with the processing algorithm).
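To make the "save and recall on request" part concrete, here's a toy sketch of differentiable memory access in the spirit of the paper: reads and writes are weighted sums over all slots, so gradients can flow through the addressing itself. All names, sizes, and the exact erase/add mechanics here are illustrative, not the paper's actual interface.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def content_weights(memory, key, beta):
    """Soft attention over memory rows by cosine similarity to `key`.

    memory: (N, W) matrix of N slots of width W
    key:    (W,) query vector emitted by the controller
    beta:   scalar sharpness; higher -> closer to a one-hot lookup
    """
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    sim = memory @ key / norms
    return softmax(beta * sim)

def read(memory, w):
    # Weighted sum of rows: every slot contributes a little, which is
    # what makes the whole lookup differentiable.
    return w @ memory

def write(memory, w, erase, add):
    # Each slot is partially erased and partially overwritten,
    # proportional to its write weight.
    return memory * (1 - np.outer(w, erase)) + np.outer(w, add)

N, W = 4, 3
M = np.zeros((N, W))
# Store a vector in slot 0, then recall it by content rather than index.
M = write(M, np.array([1., 0., 0., 0.]), np.ones(W), np.array([0.9, 0.1, 0.0]))
w = content_weights(M, np.array([0.9, 0.1, 0.0]), beta=10.0)
print(read(M, w))  # approximately the stored vector [0.9, 0.1, 0.0]
```

The point of the sketch is the last two lines: recall is by similarity to a query key, not by a hard-coded address, so "which slot to use" is itself something the controller can learn.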
i.e., the blog post discusses using the network to find the shortest path between two stations. Would the steps to do that look like this?
1. Train the NN to navigate any network, presenting the graph data each time you ask the NN a problem.
2. Take the trained NN, feed it the London Underground, and ask it to tell you how to get there?
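That's roughly my reading too: in the graph tasks the network is shown the graph as a sequence of (source, edge, destination) triples, followed by a query. A hypothetical sketch of what that input phase might look like; the station names, the blank-slot convention, and the one-hot encoding are my guesses at a format, not the paper's exact one:

```python
# Hypothetical encoding of a graph task input: the graph arrives as
# (source, relation, destination) triples, then a query with a blank.
edges = [
    ("OxfordCircus", "Victoria", "GreenPark"),
    ("GreenPark", "Jubilee", "Westminster"),
    ("GreenPark", "Piccadilly", "HydeParkCorner"),
]

# The query leaves a slot blank ("_"); the net must fill in the path.
query = ("OxfordCircus", "_", "Westminster")

def vectorize(triple, vocab):
    """One-hot encode each slot of a triple and concatenate them."""
    vec = []
    for token in triple:
        one_hot = [0] * len(vocab)
        if token in vocab:
            one_hot[vocab.index(token)] = 1
        vec.extend(one_hot)
    return vec

vocab = sorted({t for e in edges + [query] for t in e})
inputs = [vectorize(e, vocab) for e in edges] + [vectorize(query, vocab)]
print(len(inputs), "input vectors of width", len(inputs[0]))
```

The key detail is that the graph is data presented at query time, not something baked into the weights, which is why the same trained net can be handed a different map.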
Could it learn to use addresses that perform more interesting functions than f(x)=x?
Remembering the previous words in a sentence you're currently reading is more like short-term memory, though, and this paper is about long-term memories stored as data structures outside the neural net itself. This graphic from the DeepMind blog post might be helpful: https://i.imgur.com/KwXXCge.png.
The blog post from DeepMind is a bit more accessible than the Nature paper: https://deepmind.com/blog/differentiable-neural-computers/
An earlier form of DNC, the neural Turing machine [16], had a similar structure, but more limited memory access methods (see Methods for further discussion).

Has this been a problem for AI and CNNs?
The progress in the article is getting a learning system to do this, which could eventually let us tackle unsolved problems.
What kinds of other unsolved problems are out there? I'm always looking for something interesting.
A neural network without memory can't do that, or perhaps can't do it as well?