
0 points | andberx | 2mo ago | 0 comments

This is really cool. Voice cloning + translation in one pipeline is something a lot of content creators would pay for right now, especially for YouTube dubbing, where you want to keep the speaker's original personality.

Are you handling the speech-to-text, translation, and voice synthesis as separate steps or is it more of an end-to-end model? Curious how you deal with things like pacing and intonation that don't always carry over between languages.
