Why can't Torch use more threads than there are CPU cores?
Taken from the article itself:
"both of them cannot run normally when threads usage is set to be bigger than the number of CPU cores on desktop CPU."
Did the authors set up the system correctly?
You're right that Torch is faster than TensorFlow on RNNs. But Torch is slower than TensorFlow on AlexNet and ResNet.
There is a set of benchmarks covering many DL frameworks at https://github.com/soumith/convnet-benchmarks