Definitely looks promising. Are there any user-friendly ways to use this or a similar WaveNet text-to-speech system? Last time I looked into it, generating anything on your own still required a fair amount of processing power and a library of speech files.
Serious question: how does this stand out from the speech synthesis we have had for years? The only problem I can see it solving is the different pronunciations of "read".