I am fairly suspicious of this new product given that text translation is not exactly a solved problem. My wife and I travel a great deal, so we have plenty of opportunities to use Google Translate in particular.
Currently we are in Spain and translate emails to and from the person who owns the apartment we are staying in. The understandability of the translations is pretty variable. One or two typos seem to be all it takes for it to produce something unintelligible. Also, the locals in the area we are in frequently speak both Spanish and Catalan. I suspect our landlord may be throwing in the occasional Catalan word because some words (that look Spanish to me) go untranslated or are translated to something nonsensical.
Every email of a few paragraphs requires five minutes of my wife and me sitting with both the original message and the Google translation in front of us, deciphering them. Occasionally we have to bounce "was this what you meant?" emails back and forth with our landlord.
So yeah, unless their text translation is much, much better than Google Translate, if they are going to layer speech recognition on top of that, good luck to them.
I'm not sure what to think about this. I get it that everyone wants all of the world's data in order to build better software. The only NDA I have ever signed at my current job is about confidentiality and privacy of customer data. Not software though, yay! That's why I can still share my GNU Octave and Mercurial code with the whole world!
At the same time, I think it's almost fundamentally impossible to truly anonymise data without rendering that data useless for the very purpose it serves. All of these randomised IDs are just privacy theatre, to adapt a phrase from Bruce Schneier. The same machine learning tools that can help you use speech data to improve translation can be used to deanonymise this data.
Our science fiction writings are full of how we are going to build intelligent machines that would one day control us and destroy us when they become evil. I think that narrative may need to be adapted so that instead of intelligent killer robots, we have intelligent killer data in the hands of evil humans.
They don't admit to having done it historically; they say that if you sign up for the preview, they will use your conversations to improve the product going forward.
http://www.zdnet.com/article/apple-stores-your-voice-data-fo...