- https://huggingface.co/spaces/Xenova/whisper-web
- https://huggingface.co/spaces/Xenova/whisper-webgpu
- https://huggingface.co/spaces/Xenova/realtime-whisper-webgpu
- https://huggingface.co/spaces/webml-community/moonshine-web
How can I understand what's in the compiled JS though? Is there some source for that?
Here I'm talking about the model shared in this thread, which is text-to-speech (reading web content out loud)
I made https://app.readaloudto.me/ as a hobby thing and now it could be enhanced with a local tts option!
(I get the joke that, for some definition of real-time, this is real-time.)
The reason why I use an API is because time to first byte is the most important metric in the apps I'm working on.
That aside, kudos for the great work and I'm sure one day the latency on this will be super low as well.
Sounds great on Chrome with an Nvidia GTX 1650 Ti.
Sounds great on Chrome on a Pixel 6.
Sounds like it's being bitcrushed. Maybe a 64-bit vs. 32-bit sample-format error? Solid results when it's working.
Edit: Sorry, it was a problem of my specific audio setup, it works equally well on Chromium.
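On the "64 vs 32 bit" guess above: the Web Audio API expects `Float32Array` samples, so output in another width has to be narrowed before playback. A minimal sketch of that idea (the `toFloat32` helper name and the commented playback wiring are my own, for illustration):

```javascript
// Web Audio's AudioBuffer.copyToChannel() expects Float32Array samples.
// If a model hands back a Float64Array (or a plain number array), narrow
// it first; misinterpreting the sample width tends to produce garbled audio.
function toFloat32(samples) {
  return samples instanceof Float32Array
    ? samples                      // already the right width, no copy needed
    : Float32Array.from(samples);  // narrows 64-bit floats to 32-bit
}

// In a browser, assuming a hypothetical TTS result { audio, sampling_rate }:
// const ctx = new AudioContext({ sampleRate: result.sampling_rate });
// const buf = ctx.createBuffer(1, result.audio.length, result.sampling_rate);
// buf.copyToChannel(toFloat32(result.audio), 0);
```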
Is there source anywhere? It seems the assets/ folder is just bundled JS. In my opinion, there's a ton of opportunity for private, progressive web apps built on this while WebGPU support is still relatively new.
Would love to collaborate in some way if others are also interested in this.
[0] https://github.com/C-Loftus/QuickPiperAudiobook/ [1] https://github.com/rhasspy/piper/issues/352
But, on a more serious note: the story I hear about AMD GPUs is that they are, in fact, shittier because AMD themselves give fewer shits. GIGO.
this is astounding
https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_...
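For context, the built-in browser alternative linked above can be driven with a few lines of JavaScript. A minimal sketch, where `buildUtterance` is a hypothetical helper and the plain-object fallback exists only so the code runs outside a browser:

```javascript
// Browser-native TTS via the Web Speech API (speechSynthesis).
// buildUtterance() is an illustrative helper, not part of the API itself.
function buildUtterance(text, { rate = 1.0, pitch = 1.0, lang = "en-US" } = {}) {
  const u = typeof SpeechSynthesisUtterance !== "undefined"
    ? new SpeechSynthesisUtterance(text) // real browser utterance object
    : { text };                          // stub for non-browser environments
  u.rate = rate;   // 0.1 to 10; 1 is normal speed
  u.pitch = pitch; // 0 to 2; 1 is normal pitch
  u.lang = lang;
  return u;
}

// In a browser:
// speechSynthesis.speak(buildUtterance("Hello from the Web Speech API"));
```

The catch, relative to the models in this thread, is that voice quality depends entirely on what the OS/browser ships.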
Quality sounded good compared to a lot of other small TTS models I've tried.