ReferenceError: customElements is not defined
Also apparently some assertion errors with webcomponents (minified so line numbers not useful).
I don't want to be unreasonable, but Google used to at least generally support the idea of the open web. There are a bunch of different UAs out there; while I accept it's more challenging to support some than others, it doesn't seem unreasonable to expect that a product launch from a large-scale web company would at the very least give us an error message.
The web is deteriorating right in front of us, and a big contributor to that is Google's continued failure to realise that the web isn't all-Chrome, all the time. The attitude displayed here—a minor thing when compared to the overall problem—has strengthened my resolve to avoid Chrome at all costs.
This is a Machine Learning product; I don't think anyone at Google, at least as part of this team, is trying to "get you" or destroy the open web or something. This isn't even a case of Google using something non-standard -- WebComponents is part of the standard, you can even see it in Mozilla's MDN [0]. Firefox, Safari, Edge, et al. simply haven't implemented it yet (or haven't landed it in stable). Is that somehow also Chrome's fault?
Filing a bug report is good, but ranting on HN about how this is a sign of Google trying to steal the open Internet is at best unnecessary, and absolutely unreasonable.
Coincidentally, I'm working on an application that uses complicated SVG with CSS animations, and I've spent a ton of time optimizing it. I've never tested it outside of Chrome before today. To my surprise, while everything works fast in Chrome, in Safari it's bearable, but in FF it's simply too laggy to use. Now, I probably won't ever get to fix the performance issues in FF and Safari, simply because I don't have the time. Am I also out there trying to destroy the open web? Maybe I'm just bad, not evil.
[0]: https://developer.mozilla.org/en-US/docs/Web/Web_Components
Other browser vendors don't support WebComponents out of the box (yet).
Yes, but history teaches us that they are terrible at product management. I mean, they could at least just call it "Chrome TTS" or something until they put the whole thing together.
(Should be as "simple" as a polyfill for webcomponents, but I don't want to put words in the team's mouth)
[0]: https://developer.mozilla.org/en-US/docs/Web/Web_Components/...
Call me paranoid all you want, but that is their plan, just like it was Microsoft's plan for nothing on IE4/6 to work on Netscape. Heck, a third of my corporate sites already show DHTML popups saying "Use Chrome" when I reach them with Firefox.
16 * (4.5 * 110 * 60) / 1M = $0.475/hr
16 * (4.5 * 150 * 60) / 1M = $0.648/hr
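For anyone who wants to re-run the two estimates above with different numbers, here is a minimal sketch. It assumes the 16 is the WaveNet price in USD per million characters (quoted elsewhere in the thread), the 4.5 is average characters per word, and 110/150 are words per minute of speech; those readings of the factors are my interpretation, and `hourly_cost` is just an illustrative name.

```python
PRICE_PER_MILLION_CHARS = 16.0  # USD; WaveNet price quoted in the thread
CHARS_PER_WORD = 4.5            # assumption: my reading of the 4.5 factor
MINUTES_PER_HOUR = 60


def hourly_cost(words_per_minute: float) -> float:
    """Cost in USD of synthesizing one hour of continuous speech."""
    chars_per_hour = CHARS_PER_WORD * words_per_minute * MINUTES_PER_HOUR
    return PRICE_PER_MILLION_CHARS * chars_per_hour / 1_000_000


# Slow and fast speaking rates used in the estimates above.
for wpm in (110, 150):
    print(f"{wpm} wpm: ${hourly_cost(wpm):.3f}/hr")
```

Note this is the steady-state figure only; it says nothing about bursty workloads.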
If you multiply by the number of 50ms in one second (20), you do get $9.50 - $12.96.
I was messing around with the ancient VBA text to speech system. If most TTS systems sucked as much as that one, you could also make a SaaS business for finding "typos" that make the word sound correct when pronounced.
You should also include a workload volatility component to be entirely fair. Your analysis assumes it's entirely steady state.
(Work at g)
I've been waiting a long time for decent sounding open source TTS software for narrating books to me, and now with deep learning it's either here or very near here, and the hardware is going to keep getting more performant at the same price. I guess that will be very appealing to businesses relying on TTS (e.g. call centers, phone robots, and mobile apps with TTS).
This is with 1 minute of audio and 10 minutes of training, which is crazy to me. Maybe it's not "as good", but it's very good, and free, and it will get better, faster, and cheaper quickly?
That is the hope but there are no guarantees. Perhaps specialized hardware can pick up where Moore's Law has tapered off.
There's no end in sight to the improvement of neural hardware, not like the wall x86 CPUs have hit anyway.
https://research.googleblog.com/2018/03/expressive-speech-sy...
The last set is specifically interesting for your wish.
HN discussion here: https://news.ycombinator.com/item?id=16691197
* that's what I call Google/Alphabet when it doesn't matter which side of the tax-avoiding entity I am referring to.
EDIT: this is almost exactly their sample application (https://github.com/GoogleCloudPlatform/python-docs-samples/t...). Was able to get it working with epubs using pypandoc within the hour. Now just need to make it upload to Overcast...
EDIT 2: Can now convert epubs directly to mp3s on Overcast. Yay!
Uses Amazon Polly
(overcast uploading not shown–that's a separate script using mechanize)
You get e.g. ze Dscherman aczent or de frensch onehe.
Polly is priced at $4 per million characters and the Google WaveNet voices are $16 (compared with the Google non-WaveNet voices, which are also $4).
After listening to a few samples from each service, the voice quality and prosody modeling seem roughly on par between Polly and WaveNet, or at least the differences I heard didn't seem to justify a 4x price multiplier.
But I'd love to hear an informed opinion from someone with more expertise...
So in fact WaveNet competes more with voiceover and new use-cases such as voice assistants. Still I don't hear that much difference there today, but maybe WaveNet will improve in the future to human level sooner than the other models.
I could listen to this voice for a while, but the voice needs more emotion in it before it could be actually useful for long text.
I have plans to leave.
If you stress the word "plans", the sentence means that the speaker is not necessarily intending to actually leave. However, when the stress is on "leave", the speaker definitely intends to leave. A human reader can easily infer the correct meaning from context but text-to-speech systems can't because they don't have any systematic understanding of the things being talked about and the social pragmatics of the discourse. As long as these issues aren't solved, text-to-speech systems will make mistakes. These mistakes will be easy to spot in some cases but can also have catastrophic consequences in other cases: "I have plans to bomb North Korea."
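Until a system can infer this from context, the stress can be marked by hand. A minimal sketch, assuming the target service accepts SSML (the W3C speech markup) and that the chosen voice actually honors the `<emphasis>` element, which is not guaranteed; `emphasize` is a hypothetical helper name, not part of any TTS library.

```python
from xml.sax.saxutils import escape


def emphasize(sentence: str, word: str, level: str = "strong") -> str:
    """Wrap the first occurrence of `word` in an SSML <emphasis> element."""
    before, found, after = sentence.partition(word)
    if not found:
        raise ValueError(f"{word!r} does not occur in the sentence")
    return (
        "<speak>"
        + escape(before)
        + f'<emphasis level="{level}">{escape(found)}</emphasis>'
        + escape(after)
        + "</speak>"
    )


# Same text, two different intended readings.
print(emphasize("I have plans to leave.", "plans"))
print(emphasize("I have plans to leave.", "leave"))
```

This only moves the problem to whoever writes the markup; it does not make the synthesizer understand the sentence.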
https://github.com/pndurette/gTTS
Very simple and has the Google Assistant voice.
I think this is the missing thing that was needed to make this viable.
Google Assistant already has all the pieces of that (maybe not all the social media connections one might want, I haven't looked much at that), and the ability to string them together.
Hey Google, Tell me about my Day.
and
Hey Google, Tell me the news.
There's a way you can get the news added to your daily briefing (the first trigger), but I can't remember how now.
https://www.pastery.net/nujfhw/
I have no idea what the rate limits are, so please don't abuse it, I wrote it because the demo didn't work in Firefox and I wanted to play around with it more extensively.
"I'm sorry Dave. I'm afraid I can't do that."
to be prepared for what's coming ...
You upload your music to the cloud, set some parameters (genre, tempo, emotion, etc) and a bunch of lyrics and the thing will spit out awesome vocals for you.
https://cloud.google.com/text-to-speech/docs/quickstart https://cloud.google.com/speech/docs/sync-recognize
Edit: There is one, on the actual Google Cloud Text-To-Speech page, so a few clicks in and you'll get one.
The fact that the preview only seems to work on Chrome (and silently breaks everywhere else) is not cool, though.
The voices did sound quite natural and "news-readery"; however, the one issue I did find is the lack of pauses between words.
Take the example phrase: "He bought himself a boat and then took it to his house". You would expect a small pause after the word "boat".
I was able to fix it manually by adding some commas and full stops; however, the AI was not able to pick up those pauses naturally.
It sounded like someone was rushing through the speech instead of stopping occasionally to "take a breath".
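An alternative to sprinkling commas is to request the pause explicitly. A minimal sketch, assuming the service accepts SSML input and honors the `<break>` element; the 300 ms default is an arbitrary guess, and `with_pause` is a hypothetical helper name.

```python
from xml.sax.saxutils import escape


def with_pause(text: str, after_word: str, pause_ms: int = 300) -> str:
    """Insert an SSML <break> right after the first occurrence of `after_word`."""
    before, found, after = text.partition(after_word)
    if not found:
        raise ValueError(f"{after_word!r} does not occur in the text")
    return (
        "<speak>"
        + escape(before + found)
        + f'<break time="{pause_ms}ms"/>'
        + escape(after)
        + "</speak>"
    )


print(with_pause("He bought himself a boat and then took it to his house.", "boat"))
```

The pause still has to be placed by hand, so this is a workaround for the missing prosody rather than a fix.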
Requires Chrome.