It works really well for non-fiction long-form content (i.e. hours of audio).
It’s early days for AudiowaveAI, and I’m looking for feedback to improve the product. Try it out and share your thoughts: [AudiowaveAI](https://audiowaveai.com). Thanks!
There are plenty of blind folks who use traditional text-to-speech for navigating our devices. We prefer the robotic voice at ridiculously high speeds. We're humans too.
I would love the option to switch to a more natural voice for more literary text (or even a fan fic), so I'll definitely be checking this out.
I'm curious if it would be possible to do some kind of analysis to determine the number of individual characters in the text who are speaking, and then assign an appropriate voice to each of them. So if you had something like descriptive language interspersed with a conversation between two characters, you'd have three voices (a narrator, Character A, and Character B) that are consistent across the text.
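As a rough illustration of the first step only (nothing like this is confirmed to exist in the product): a naive regex pass can at least separate narration from quoted dialogue. Actually attributing each quote to the right character would need real coreference analysis or an LLM; here every quote is just tagged "dialogue".

```python
import re

def split_voices(paragraph):
    """Split a paragraph into (voice, text) spans: quoted spans are
    tagged "dialogue", everything else goes to the narrator voice.
    Naive heuristic -- assumes straight double quotes, no nesting."""
    spans, pos = [], 0
    for m in re.finditer(r'"[^"]*"', paragraph):
        narration = paragraph[pos:m.start()].strip()
        if narration:
            spans.append(("narrator", narration))
        spans.append(("dialogue", m.group(0).strip('"')))
        pos = m.end()
    tail = paragraph[pos:].strip()
    if tail:
        spans.append(("narrator", tail))
    return spans
```

For example, `split_voices('She frowned. "Go away," she said. He left.')` yields a narrator span, a dialogue span, then a narrator span — the part this sketch doesn't solve is deciding *which* character said the dialogue span.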
For more complex writing with many characters, you'd probably need a wide library of possible voices, and the analysis piece would need to be spot-on, since it would be very confusing to have one character's lines spoken by the wrong voice.
Regarding fanfics, many authors give (or withhold) permissions around creating derivative versions of their work via avenues like ficbinding. Before using a tool like this to create an audio version of their writing, I'd suggest reaching out to a fic's author to see if they'd be okay with that. For personal-only use, though, and especially if it's in context of accessibility for visually-impaired folks, I imagine that many of them would probably be okay with it.
What I do is split the book up into sentences, generate speech for each sentence, and at the same time turn that sentence into subtitles. Then I combine the two and stitch them all together into an mp4 container with an audio and a subtitle track using ffmpeg. mpv (and I think VLC) can display subtitles synced to audio playback even when there is no video track.
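The timing step above can be sketched roughly like this — a minimal sketch, not the commenter's actual code. It assumes you already know each sentence clip's duration (e.g. by probing the generated file) and that the clips are concatenated back to back:

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def build_srt(sentences_with_durations):
    """Build an SRT track from (sentence, duration_seconds) pairs,
    assuming the audio clips are stitched together in order."""
    cues, t = [], 0.0
    for i, (text, dur) in enumerate(sentences_with_durations, start=1):
        start, end = t, t + dur
        cues.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
        t = end
    return "\n".join(cues)

def mux_command(audio_path, srt_path, out_path):
    """ffmpeg invocation muxing the stitched audio plus the SRT into
    one mp4 container (mov_text is the mp4-native subtitle codec)."""
    return ["ffmpeg", "-i", audio_path, "-i", srt_path,
            "-c:a", "copy", "-c:s", "mov_text", out_path]
```

Run the returned command with `subprocess.run(...)`; players like mpv then show each sentence as its audio plays.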
The issue I personally found with traditional TTS is the lack of emotional range and lack of thoughtful pauses. ML models are better at this and at picking up on small cues that are hard to program into a TTS otherwise.
I love that Safari on iPhone has built-in TTS now and was excited to use it. It actually didn't work on MAKE by Pieter Levels after I bought it. So I went to explore other options. After I started listening to AI-generated TTS, I just couldn't go back. It's like 270p vs 2160p (4K).
I've been looking for something that would let me synchronize Librivox recordings with Project Gutenberg epub files, but as much as I love the Librivox volunteers for their contributions, a lot of the recordings are such low audio quality that they're not fun to listen to. This would be a big step up, and there's no copyright worries for this use case because the works are in the public domain!
App is Listenly.io
So similar to my app. But I'm not a real programmer, so of course yours is more refined.
I almost launched the same exact online business.
Here's my version (my github version is a bit less refined than my local code):
I also experimented with a desktop app and tried to run open-source models locally.
Being a "real programmer" actually hurts you, I had a lot of things to unlearn to just ship fast and keep iterating. I was too stuck looking for the "best practices" or for it to be "just right" (code words for perfectionism).
So keep iterating and writing many projects. This is project 12 in 16 weeks. I've been doing this challenge of 52 startups in 52 weeks. It's been tremendously helpful. (more about it: http://52shipped.com)
I just converted my app from monolithic to modular, and switched from PySimpleGUI to Qt6.
I think the second biggest factor that kept me from launching the online business was reading up on all the big online marketplaces banning AI audiobooks, at least for now. So I just use it for myself.
Some of my initial versions even had a button to make the output compliant with part of the ACX requirements using complex ffmpeg commands.
I've also found that using the iZotope or Adobe Audition (best) algorithm to stretch the audio by 8-15% makes the listening experience better for difficult material, since OpenAI's slower speed settings don't sound good. So I tend to use tts-1-hd and a 12% stretch with Adobe Audition.
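For anyone without Audition or iZotope, ffmpeg's `atempo` filter is a rough free alternative for pitch-preserving time stretching — a sketch under that assumption, not a quality match for those tools. Note that making audio 12% *longer* means a tempo factor *below* 1:

```python
def stretch_tempo(stretch_pct):
    """Convert a desired length increase (e.g. 12 for +12% longer)
    into the factor ffmpeg's atempo filter expects. atempo < 1.0
    slows playback without changing pitch; single instances accept
    roughly 0.5-2.0 on older ffmpeg builds (chain filters beyond that)."""
    factor = 1.0 / (1.0 + stretch_pct / 100.0)
    if not (0.5 <= factor <= 2.0):
        raise ValueError("chain multiple atempo filters for this factor")
    return round(factor, 4)

def stretch_command(src, dst, stretch_pct=12):
    """ffmpeg command applying the stretch to the audio stream."""
    return ["ffmpeg", "-i", src,
            "-filter:a", f"atempo={stretch_tempo(stretch_pct)}", dst]
```

So a 12% stretch comes out to `atempo=0.8929`.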
What I would really like is an option to download the whole book as mp3 for offline playback, and different voices for each character.
I found a single file didn't make sense as you lose your place in it easily.
Splitting each chapter into its own file seemed to work well instead.
But you really do want a listening app. Otherwise it gets harder to share and listen to on your phone.
For now, I've created a listening app as a PWA.
The costs will go down in the future, I hope, and there are promising open-source projects coming up. Their quality is still pretty subpar and I have a comparison table I'll share soon on HN.
Otherwise it is Next.js on Vercel with Postgres DB.
Hope it helps
The videogame Final Fantasy XIV has a lot of text. A LOT of text.
Someone has made a plugin to pipe text to external tts services, or a websocket. You talk to characters in game and hear the dialog read by the tts.
https://github.com/karashiiro/TextToTalk
For whatever reason, Amazon Polly only exposes middling-quality voices to the plugin. And I'd rather not have an active AWS account for just this use case.
ElevenLabs is supported by the plugin, but their service isn't really about TTS, and I'd have to pay for the $220/yr tier to unlock further "pay as you go (per character)" usage with a budget of 100,000 characters per month. A bit steep for use in just this one game.
If someone could help plumb AudiowaveAI to this plugin, I'd gladly turn off AWS for this!
A couple of questions:
How do I delete projects?
I must have tapped three times after submitting a Wikipedia article and it created three projects that apparently cannot be deleted.
How do I delete my account?
And for $15 I get credits. How many credits do I get for $15? Is each credit a word converted? 1 credit == 1 word converted to audio?
> How do I delete projects?
Three dots on the side of the project, you can delete it
> I must have tapped three times after submitting a Wikipedia article and it created three projects that apparently cannot be deleted.
> How do I delete my account?
Just email me at support@audiowaveai.com from that email and I'll delete it for you. Still an MVP, no functionality for that yet.
> And for $15 I get credits. How many credits do I get for $15? Is each credit a word converted? 1 credit == 1 word converted to audio?
1 credit = 1 character. You're right, I need to be clearer about it. $15 would give you about 10 hrs of audio, or ~100 articles (5-6 mins each). ElevenLabs would cost you $99 for the same amount of audio.
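Back-of-envelope, those figures hang together if $15 buys 600,000 credits (the character limit mentioned elsewhere in this thread) and narration runs around 150 words per minute — all assumed numbers for illustration, not official ones:

```python
# Assumed: $15 buys 600,000 credits at 1 credit per character, and
# spoken narration runs ~150 words/min at ~6.7 characters per word.
CREDITS_PER_DOLLAR = 600_000 / 15      # 40,000 characters per dollar
CHARS_PER_MINUTE = 150 * 6.7           # ~1,005 characters per minute

def audio_hours(dollars):
    """Estimated hours of audio a given spend buys."""
    chars = dollars * CREDITS_PER_DOLLAR
    return chars / CHARS_PER_MINUTE / 60
```

Under those assumptions `audio_hours(15)` comes out just under 10 hours, consistent with the "about 10 hrs" claim.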
How have you got the costs so low? Also the GCP voices don't have as natural intonation. How did you do that?
I really didn't think there would be a market for this either.
What I'm actually interested in is your pricing model. Why do you have constraints on characters AND articles, versus just characters? Does each conversion have a fixed cost, such that you don't want someone making 10,000 requests a month? Or are the article count and hours of audio just estimates based on the 600,000-character limit?
If it's just an estimate of real usage of the actual 600,000 character limit, then I'd try and word it differently, otherwise I feel like I'm going to be heavily constrained by the platform.
That would be a lot clearer and would avoid any misunderstandings. I just need to make the changes in the app.
Thanks for the great feedback
So you can either copy and paste a markdown file there or upload it. PDF and epub would work too, but they're a bit more finicky.
I'm happy to help; a few authors have contacted me, and I'm releasing an HD quality tier for authors this week too (less static noise).
Ping me at michael@audiowaveai.com
Works best with .epub
The base AI model sounded like Whisper from OpenAI. Did you train the voice yourself, or is it one of the defaults?
I am always curious as to what copyright issues products like this run into. Also, what's the stack like to build something like this?
I would love to be able to run the models on the device, and it will come in the future. The OSS models are not quite there from a quality standpoint yet. But they will surely get there.
There is some promise in MyShell's models on Hugging Face, and I hope to see them keep evolving.