This paper is misleading in calling its method a "deep neural network". It's only "deep" in the sense that it stacks multiple physical layers, not in the machine learning sense. From a neural network point of view it's a single-layer network: the entire thing is linear, and a stack of linear layers collapses into one linear map. 92% accuracy on MNIST is about what you can achieve with a single-layer network:
https://www.tensorflow.org/versions/r1.2/get_started/mnist/b...
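To make the "it's all linear, so it's one layer" point concrete, here is a minimal NumPy sketch. The layer count, the sizes, and the random real-valued matrices are all my own assumptions for illustration; this does not model the paper's actual optics, it only shows that stacked linear maps with no nonlinearity in between reduce to a single matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration (not from the paper).
n = 64          # dimension of the input and of every layer
num_layers = 5  # number of stacked "physical" layers

# Each layer is a plain linear map; there is no nonlinearity between layers.
layers = [rng.standard_normal((n, n)) / np.sqrt(n) for _ in range(num_layers)]

def stacked(x, layers):
    """Apply the layers one after another, as the physical stack would."""
    for W in layers:
        x = W @ x
    return x

def collapse(layers):
    """Multiply all layer matrices into one equivalent single-layer matrix."""
    W_total = np.eye(layers[0].shape[1])
    for W in layers:
        W_total = W @ W_total  # later layers multiply on the left
    return W_total

x = rng.standard_normal(n)
W_total = collapse(layers)

# The 5-layer stack and the single collapsed matrix give the same output.
same = np.allclose(stacked(x, layers), W_total @ x)
print(same)
```

Put a nonlinearity (e.g. a ReLU) between the layers and the collapse no longer works, which is what "deep" normally buys you in the machine learning sense.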