> But we don’t solely rely on how well we hear since we have knowledge that allows us to correct for poor hearing based on what is being said rather than forging ahead with a nonsense transcription.
Good speech transcription AI already does that; that's why these systems work best when they know which language they're operating in, since the language model lets them weight candidate transcriptions toward the most likely word sequences.
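Roughly, the idea is a noisy-channel setup: the acoustic model proposes candidate transcriptions, and a language model re-ranks them so plausible word sequences beat acoustically similar nonsense. Here's a toy sketch of that re-ranking (the candidates, scores, and tiny bigram model are all made up for illustration, not any real engine's internals):

```python
import math

# Hypothetical acoustic scores (log-probabilities) for what was "heard".
candidates = {
    "recognize speech": -4.2,
    "wreck a nice beach": -3.9,  # acoustically slightly better!
}

# A tiny made-up bigram language model: log P(word | previous word).
bigram_logprob = {
    ("<s>", "recognize"): math.log(0.01),
    ("recognize", "speech"): math.log(0.5),
    ("<s>", "wreck"): math.log(0.0001),
    ("wreck", "a"): math.log(0.1),
    ("a", "nice"): math.log(0.05),
    ("nice", "beach"): math.log(0.1),
}

def lm_score(sentence: str) -> float:
    """Sum bigram log-probabilities over the sentence."""
    words = ["<s>"] + sentence.split()
    # Unseen bigrams get a small floor probability.
    return sum(bigram_logprob.get(pair, math.log(1e-6))
               for pair in zip(words, words[1:]))

def best_transcription(cands: dict, lm_weight: float = 1.0) -> str:
    """Pick the candidate maximizing acoustic score + weighted LM score."""
    return max(cands, key=lambda s: cands[s] + lm_weight * lm_score(s))

print(best_transcription(candidates))  # → recognize speech
```

Even though "wreck a nice beach" scores marginally better acoustically here, the language model prior pulls the final answer to the sensible reading, which is exactly the "correct for poor hearing based on what is being said" behavior.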
I think Apple's most recent WWDC even had a video on adding custom vocabulary for their speech engine that covered this exact topic, though I can't search for it right now.