Given how many deep learning toolkits have popped up and the complexities involved, these evaluations are quite useful. Code comparisons would be even better but I feel that's asking too much given how quickly things move :)
One important note: if you're looking to actually do standard tasks, use a higher-level library. My favourite is fchollet's Keras[1], given that it supports both Theano and TensorFlow as backends. It will likely support more in the future (Neon is probably the next contender), giving you better performance and helping you avoid legacy toolkit-specific code.
I'm confident many of the issues with TensorFlow will be cleared up sooner rather than later, especially given it was only open sourced a month and a half ago. As an example, bidirectional RNNs are trivial to implement yourself (~5 lines of code), and TensorFlow 0.6 already ships working code for them[2]; the API just isn't publicly listed yet.
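To illustrate why a bidirectional RNN is "trivial": it's just a forward RNN plus a second RNN run over the reversed sequence, with the two hidden states paired up per time step. A minimal pure-Python sketch (toy scalar state and shared weights for brevity; real implementations use vectors/matrices and separate forward/backward parameters):

```python
import math

def rnn(xs, w_x, w_h, h0=0.0):
    # simple tanh RNN: returns the hidden state at every time step
    hs, h = [], h0
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
        hs.append(h)
    return hs

def birnn(xs, w_x, w_h):
    # bidirectional = forward pass + backward pass over the reversed
    # input, re-reversed so states line up, then paired per time step
    fwd = rnn(xs, w_x, w_h)
    bwd = rnn(xs[::-1], w_x, w_h)[::-1]
    return list(zip(fwd, bwd))
```

That pairing (usually a concatenation of the two state vectors) is the whole trick, which is why it's only a few lines on top of an existing RNN.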
Most importantly for this review, single-device performance is already at cuDNNv2 Torch levels for TensorFlow as of the 0.6 update[3]. Neither soumith's benchmarks nor this evaluation have been updated with the new numbers, but Google replicated the benchmarks and presented them at NIPS. They're working on adding cuDNNv3 support, which should be another speed jump.
I'm a fan of Keras too. Including comments on Keras and other higher-level libraries would have made my original review too long, so I dropped them.
Note that the performance review is very much incomplete. As mentioned in the blog: deep learning is not just about feed-forward convnets, not just about ImageNet, and certainly not just about a few passes over the network. However, Soumith's benchmark is the only notable one as of today, so we base the single-GPU performance rating on it.
For TF, I think a bigger single-node perf issue is memory allocation. At NIPS, Jeff Dean didn't have a straightforward answer for why TF's memory performance is so poor.
[1]: http://keras.io/
[2]: https://github.com/tensorflow/tensorflow/blob/master/tensorf...
[3]: https://twitter.com/deliprao/status/673888452736245760