> But we don’t solely rely on how well we hear since we have knowledge that allows us to correct for poor hearing based on what is being said rather than forging ahead with a nonsense transcription.
Good speech transcription AI already does that; that's why these systems work best when they know which language they're operating in, since the language model lets them weight candidate transcriptions toward the most likely word sequences.
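Roughly, the idea is a noisy-channel setup: the acoustic model proposes candidate transcriptions, and a language model re-ranks them so plausible word sequences beat acoustically similar nonsense. Here's a toy sketch of that re-ranking (the candidates, scores, and tiny bigram model are all made up for illustration, not any real engine's internals):

```python
import math

# Hypothetical acoustic scores (log-probabilities) for what was "heard".
candidates = {
    "recognize speech": -4.2,
    "wreck a nice beach": -3.9,  # acoustically slightly better!
}

# A tiny made-up bigram language model: log P(word | previous word).
bigram_logprob = {
    ("<s>", "recognize"): math.log(0.01),
    ("recognize", "speech"): math.log(0.5),
    ("<s>", "wreck"): math.log(0.0001),
    ("wreck", "a"): math.log(0.1),
    ("a", "nice"): math.log(0.05),
    ("nice", "beach"): math.log(0.1),
}

def lm_score(sentence: str) -> float:
    """Sum bigram log-probabilities over the sentence."""
    words = ["<s>"] + sentence.split()
    # Unseen bigrams get a small floor probability.
    return sum(bigram_logprob.get(pair, math.log(1e-6))
               for pair in zip(words, words[1:]))

def best_transcription(cands: dict, lm_weight: float = 1.0) -> str:
    """Pick the candidate maximizing acoustic score + weighted LM score."""
    return max(cands, key=lambda s: cands[s] + lm_weight * lm_score(s))

print(best_transcription(candidates))  # → recognize speech
```

Even though "wreck a nice beach" scores marginally better acoustically here, the language model prior pulls the final answer to the sensible reading, which is exactly the "correct for poor hearing based on what is being said" behavior.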
I think Apple's most recent WWDC even had a video on adding custom vocabulary for their speech engine that covered this exact topic, though I can't search for it right now.