Neural networks are great at pattern recognition. Things like LSTMs allow pattern recognition through time, so they can develop "memories". This is useful in things like understanding text (the meaning of one word often depends on the previous few words).
But how can a neural network know "facts"?
Humans have things like books, or the ability to ask others for things they don't know. How would we build something analogous to that for neural network-powered "AIs"?
There's been a strand of research mostly coming out of Jason Weston's Memory Networks work[1]. This paper extends that line by using a new form of memory, and shows that it can perform some pretty difficult tasks, including graph tasks like London Underground traversal.
One good quote showing how well it works:
In this case, the best LSTM network we found in an extensive hyper-parameter search failed to complete the first level of its training curriculum of even the easiest task (traversal), reaching an average of only 37% accuracy after almost two million training examples; DNCs reached an average of 98.8% accuracy on the final lesson of the same curriculum after around one million training examples.
Would this be an apt metaphor: an LSTM is like a student who has to memorize how to do every problem before the test, while a DNC still learns how to take the test but is allowed to look at its notes.
From what I can tell you can give the DNC simple inputs and it can derive complex answers.
The framework in this paper trains a neural net which interacts with a memory bank much as a CPU interacts with RAM. That means it can save and recall data on request, which could lead to more flexible architectures (you can give a trained net different data to recall) and easier training (since a memory-based architecture means the neural weights no longer have to learn the data along with the processing algorithm).
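To make the "save and recall on request" part concrete, here's a toy sketch of differentiable memory access in the spirit of the paper: reads and writes are weighted sums over all slots, so gradients can flow through the addressing itself. All names, sizes, and the exact erase/add mechanics here are illustrative, not the paper's actual interface.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def content_weights(memory, key, beta):
    """Soft attention over memory rows by cosine similarity to `key`.

    memory: (N, W) matrix of N slots of width W
    key:    (W,) query vector emitted by the controller
    beta:   scalar sharpness; higher -> closer to a one-hot lookup
    """
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    sim = memory @ key / norms
    return softmax(beta * sim)

def read(memory, w):
    # Weighted sum of rows: every slot contributes a little, which is
    # what makes the whole lookup differentiable.
    return w @ memory

def write(memory, w, erase, add):
    # Each slot is partially erased and partially overwritten,
    # proportional to its write weight.
    return memory * (1 - np.outer(w, erase)) + np.outer(w, add)

N, W = 4, 3
M = np.zeros((N, W))
# Store a vector in slot 0, then recall it by content rather than index.
M = write(M, np.array([1., 0., 0., 0.]), np.ones(W), np.array([0.9, 0.1, 0.0]))
w = content_weights(M, np.array([0.9, 0.1, 0.0]), beta=10.0)
print(read(M, w))  # approximately the stored vector [0.9, 0.1, 0.0]
```

The point of the sketch is the last two lines: recall is by similarity to a query key, not by a hard-coded address, so "which slot to use" is itself something the controller can learn.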
i.e., the blog post discusses using the network to find the shortest path between two stations. Would the steps to do that look like this?
1. Train the NN to navigate any network, presenting the graph data each time you ask the NN a problem.
2. Take the trained NN, feed it the London Underground, and ask it to tell you how to get there?
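That's roughly my reading too: in the graph tasks the network is shown the graph as a sequence of (source, edge, destination) triples, followed by a query. A hypothetical sketch of what that input phase might look like; the station names, the blank-slot convention, and the one-hot encoding are my guesses at a format, not the paper's exact one:

```python
# Hypothetical encoding of a graph task input: the graph arrives as
# (source, relation, destination) triples, then a query with a blank.
edges = [
    ("OxfordCircus", "Victoria", "GreenPark"),
    ("GreenPark", "Jubilee", "Westminster"),
    ("GreenPark", "Piccadilly", "HydeParkCorner"),
]

# The query leaves a slot blank ("_"); the net must fill in the path.
query = ("OxfordCircus", "_", "Westminster")

def vectorize(triple, vocab):
    """One-hot encode each slot of a triple and concatenate them."""
    vec = []
    for token in triple:
        one_hot = [0] * len(vocab)
        if token in vocab:
            one_hot[vocab.index(token)] = 1
        vec.extend(one_hot)
    return vec

vocab = sorted({t for e in edges + [query] for t in e})
inputs = [vectorize(e, vocab) for e in edges] + [vectorize(query, vocab)]
print(len(inputs), "input vectors of width", len(inputs[0]))
```

The key detail is that the graph is data presented at query time, not something baked into the weights, which is why the same trained net can be handed a different map.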
Could it learn to use addresses that perform more interesting functions than f(x)=x?
Remembering the previous words in a sentence you're currently reading is more like short-term memory, though, and this paper is about long-term memories stored as data structures outside the neural net itself. This graphic from the DeepMind blog post might be helpful: https://i.imgur.com/KwXXCge.png.
The blog post from DeepMind is a bit more accessible than the Nature paper: https://deepmind.com/blog/differentiable-neural-computers/
An earlier form of DNC, the neural Turing machine [16], had a similar structure, but more limited memory access methods (see Methods for further discussion).

Has this been a problem for AI and CNNs?
The progress in the article is getting a learning system to do this, which could eventually let us tackle unsolved problems.
What kinds of other unsolved problems are out there? I'm always looking for something interesting.
A neural network without memory can't do that, or perhaps can't do it as well?