The Neural GPU and the Neural RAM machine

Ilya Sutskever, Google Brain.

We present two new architectures for learning algorithmic concepts that have a short description length. Our first model, the Neural GPU, is a computationally efficient architecture for learning algorithms; success is measured by the model's ability to correctly classify test instances that are much longer than the longest training instance. We show that it succeeds on several tasks, including the long multiplication of binary numbers. Our second model, the Neural Random-Access Machine (NRAM), is a Neural Turing Machine-style model with primitives that support the explicit manipulation and dereferencing of pointers. We show that it can learn to solve several simple tasks involving pointer-based data structures, such as linked lists and binary trees.
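To make the first model concrete, below is a minimal sketch of the kind of convolutional gated recurrent update the Neural GPU is built from: a GRU-style cell whose matrix products are replaced by convolutions, so one small, fixed set of parameters is applied across a state of any width. This is an illustrative sketch, not the paper's code; the function names, parameter keys, and the choice of a 1-D zero-padded convolution are all assumptions made here for brevity.

    import numpy as np

    def conv1d_same(x, w, b):
        # 'Same'-width 1-D convolution. x: (n, cin); w: (k, cin, cout); b: (cout,).
        k = w.shape[0]
        xp = np.pad(x, ((k // 2, k // 2), (0, 0)))
        windows = np.stack([xp[i:i + k] for i in range(x.shape[0])])  # (n, k, cin)
        return np.einsum('nkc,kco->no', windows, w) + b

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def cgru_step(s, p):
        # One convolutional gated recurrent update of the state s: (n, channels).
        # All kernels must map channels -> channels so the state keeps its shape.
        u = sigmoid(conv1d_same(s, p['Wu'], p['bu']))        # update gate
        r = sigmoid(conv1d_same(s, p['Wr'], p['br']))        # reset gate
        c = np.tanh(conv1d_same(r * s, p['Wc'], p['bc']))    # candidate state
        return u * s + (1.0 - u) * c                         # gated blend

The Neural GPU embeds an input of length n into the state, applies such an update on the order of n times, and decodes the answer from the final state. Because the parameters are convolutional, the identical network runs on inputs of any length, which is what makes the length-generalization measurement above possible.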
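For the second model, the key primitive is pointer dereferencing over a differentiable memory. A hedged sketch of the idea follows, assuming the NRAM-style convention that a pointer is a probability distribution over memory addresses and that each memory cell itself holds such a distribution; the function names here are illustrative.

    import numpy as np

    def soft_read(pointer, memory):
        # Expected cell content under a soft pointer: sum_i pointer[i] * memory[i].
        # pointer: (n,) distribution over addresses; memory: (n, n), one
        # distribution per row.
        return pointer @ memory

    def soft_write(pointer, value, memory):
        # Blend 'value' into every cell in proportion to the pointer mass on it,
        # leaving cells with little pointer mass almost unchanged.
        return (1.0 - pointer)[:, None] * memory + pointer[:, None] * value[None, :]

Because a read result is itself a distribution over addresses, it can be reused as a pointer: following one link of a linked list is just pointer = soft_read(pointer, memory), which is the sense in which the model manipulates and dereferences pointers.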