- https://huggingface.co/spaces/Xenova/whisper-web
- https://huggingface.co/spaces/Xenova/whisper-webgpu
- https://huggingface.co/spaces/Xenova/realtime-whisper-webgpu
- https://huggingface.co/spaces/webml-community/moonshine-web
Quality sounded good compared to a lot of other small TTS models I've tried.
But, in a more serious tone: the story that I hear about AMD GPUs is that they are, in fact, shittier because AMD themselves give fewer shits. GIGO
https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_...
Sounds great on Chrome with an Nvidia 1650Ti.
Sounds great on Chrome on a Pixel 6.
Sounds like it's being bitcrushed. Maybe a 64- vs 32-bit error? Solid results when it works.
Edit: Sorry, it was a problem with my specific audio setup; it works equally well on Chromium.
I made https://app.readaloudto.me/ as a hobby project, and now it could be enhanced with a local TTS option!
(I get the joke that for some definition of real-time this is real-time).
I use an API because time to first byte is the most important metric in the apps I'm working on.
That aside, kudos for the great work and I'm sure one day the latency on this will be super low as well.
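For anyone who wants to compare setups on that metric, here is a rough sketch of how time to first byte could be measured for any source that yields audio in chunks. The names `time_to_first_chunk` and `fake_tts_stream` are made up for illustration, and the fake stream only simulates latency with `sleep`; it is not how any particular TTS API or the in-browser model behaves.

```python
import time
from typing import Iterable, Iterator, Optional, Tuple


def time_to_first_chunk(chunks: Iterable[bytes]) -> Tuple[Optional[float], float, int]:
    """Consume a chunked audio stream and time it.

    Returns (seconds until the first non-empty chunk, total seconds, total bytes).
    The first value is None if the stream never produced any data.
    """
    start = time.perf_counter()
    first: Optional[float] = None
    total_bytes = 0
    for chunk in chunks:
        if chunk and first is None:
            # The first audible data arrived: this is the "time to first byte".
            first = time.perf_counter() - start
        total_bytes += len(chunk)
    total = time.perf_counter() - start
    return first, total, total_bytes


def fake_tts_stream(n_chunks: int = 3, delay: float = 0.05) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS source (assumption, not a real API)."""
    for _ in range(n_chunks):
        time.sleep(delay)      # pretend synthesis or network latency
        yield b"\x00" * 1024   # 1 KiB of silent placeholder audio


if __name__ == "__main__":
    ttfb, total, nbytes = time_to_first_chunk(fake_tts_stream())
    print(f"first chunk after {ttfb:.3f}s, {nbytes} bytes in {total:.3f}s")
```

A model that generates the whole clip before returning anything would show the first number roughly equal to the total, which is the gap being described here; a streaming source keeps the first number small even when the total is long.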
Is the source available anywhere? It looks like the assets/ folder is just bundled JS. In my opinion, there's a ton of opportunity for private, progressive web apps with this while WebGPU is still relatively newly implemented.
Would love to collaborate in some way if others are also interested in this.
[0] https://github.com/C-Loftus/QuickPiperAudiobook/
[1] https://github.com/rhasspy/piper/issues/352
this is astounding