Looking forward to your feedback as you try it out.
We can do better, however, and we're working on ways to make better use of the hardware (for example, if your model has no data-dependent choices, we can enqueue kernels on all GPUs in your machine in parallel from a single Python thread, which will perform much better than explicit Python multithreading).
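To make that concrete, here's a minimal sketch (my own illustration, not the experimental API being described) of issuing kernels on several GPUs from one Python thread under eager execution; the GPU count is an assumption, and how much the kernels actually overlap depends on the runtime's asynchronous dispatch:

    import tensorflow as tf

    tf.enable_eager_execution()  # TF 1.7+; earlier builds used tf.contrib.eager

    results = []
    for i in range(2):  # pretend the machine has two GPUs
        with tf.device('/gpu:%d' % i):
            x = tf.random_normal([1024, 1024])
            results.append(tf.matmul(x, x))

    for r in results:
        _ = r.numpy()  # blocks until that GPU's kernel has finished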
Be on the lookout as we release new experimental APIs to leverage multiple GPUs and multiple machines.
Chainer started it, had been around for years earlier, and it still has more users. So Google is not copying PyTorch; it's copying Chainer.
You can't please everybody; whether or not they listen to users, people will still complain. But if both projects are making an effort to improve, the community can only benefit from the competition.
Doing image classification, object localization, and homography (given an input image, which of my known template images matches it, and in what orientation).
There's a lot of work being done on this specific part. If you have a standard RNN architecture you want to run, you can probably use the cuDNN code in tf.contrib.cudnn_rnn to get a super fast implementation.
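For what it's worth, a minimal sketch of what that looks like, with the module path and call signature as I recall them from the TF 1.x contrib API (requires an NVIDIA GPU; the layer sizes and shapes are made up for illustration):

    import tensorflow as tf

    # cuDNN-backed LSTM from contrib; inputs are time-major:
    # [time, batch, features].
    lstm = tf.contrib.cudnn_rnn.CudnnLSTM(num_layers=2, num_units=128)
    inputs = tf.random_normal([20, 32, 64])
    outputs, (h, c) = lstm(inputs)  # outputs: [20, 32, 128]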
There is some performance work that needs to be done on properly caching weights between time steps of an RNN if you use a tf.nn.rnn_cell.RNNCell. Currently, if you want to implement a custom architecture, a seq2seq decoder, or an RL agent, this is the API you would want to use. Several of the eager benchmarks are based on this API, so its performance will only improve.
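Concretely, the per-step pattern in question looks roughly like this (a sketch under eager execution; the unit counts and shapes are made up):

    import tensorflow as tf

    tf.enable_eager_execution()

    cell = tf.nn.rnn_cell.LSTMCell(num_units=128)
    batch_size = 32
    inputs = tf.random_normal([10, batch_size, 64])  # [time, batch, features]
    state = cell.zero_state(batch_size, tf.float32)

    outputs = []
    for t in range(10):
        # One cell call per time step; this is where weight caching matters.
        output, state = cell(inputs[t], state)
        outputs.append(output)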
I'm hopeful that for the next major release, we'll also have support for eager in tf.contrib.seq2seq.
Eager is actually not as innocent as "open-source projects borrowing the best parts from each other", as some commenters here suggest.
Google is attempting to dominate the machine-learning API and the Python ecosystem for scientific computing.
The company that controls the API influences which apps are built on it and how. Think about how Google bundled Android services on top of Android, and how that posed an existential threat to other companies. That's what's coming for TensorFlow. Many developers are too naive to realize it, or too short-sighted to care.
I wouldn't compare a permissively licensed library to Android services at all.
I didn't compare Tensorflow to Android services. I said that Tensorflow would serve as the basis of a service bundle, much like Android did. Let's come back in a couple years and I'll tell you I told you so.
Insulting the reader
> but Google is not a charity
truism
> They are pouring commercial resources...
As opposed to "non-commercial resources"?
> ... for a reason.
Everything happens for a reason.
> That reason is Google Cloud.
> How do I know this?
Pray tell!
> Because Tensorflow works better on Google Cloud than anywhere else.
This is the only real argument in this conspiracy theory. And if "anywhere" includes users' own hardware, it's wrong: TensorFlow runs flawlessly on any Linux/NVIDIA hardware. Maybe it works better on GCE than on AWS, but that again falls into the "rather unsurprising" category of factoids.
> Google is making a concerted effort to catch up with AWS in cloud, mostly through machine learning.
This can be rewritten as "Google has a cloud offering, which it tries to sell. And right now, machine learning is pretty hot." Throwing a "concerted effort" in there is just an attempt to jazz it up into something ominous. Which it isn't.
> I didn't compare Tensorflow to Android services. I said that Tensorflow would serve as the basis of a service bundle, much like Android did.
"The basis of a service bundle" actually doesn't sound that scary. Nobody is disputing that Google offers services build on tensorflow. It just isn't any sort of "Trojan horse" conspiracy, and it is somewhat limited by the fact the tensorflow is OSS licensed and could be forked by anybody people suddenly find out it's full of geek soldiers.
It has to be consistent and there has to be one way to do it.
I personally have a 10-message thread with Google Cloud support about exporting a Cloud-trained model to TensorFlow, and nobody could figure it out [Case #13619720].
In fact, if you dig up the case, even official support told me that a SavedModel needs some freezing with Bazel, otherwise it doesn't work.
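For anyone stuck on the same thing, this is roughly the freeze step they meant, sketched with the freeze_graph tool that ships with TF 1.x, so no Bazel build should be needed (keyword names as I recall the 1.x signature; the paths and the "scores" output node are placeholders, not values from the support case):

    from tensorflow.python.tools import freeze_graph

    # Fold a SavedModel's variables into constants in a single GraphDef.
    freeze_graph.freeze_graph(
        input_graph="", input_saver="", input_binary=True,
        input_checkpoint="", output_node_names="scores",
        restore_op_name="", filename_tensor_name="",
        output_graph="/tmp/frozen.pb", clear_devices=True,
        initializer_nodes="",
        input_saved_model_dir="/path/to/saved_model")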
The GitHub page and Stack Overflow are full of these. If you can, please take the message to the other side :(
I don't think the cloud guys (where training happens in distributed mode) talk to the Android guys (where models get used after quantization). There is a huge serialization problem that all of us are currently struggling with.