Sure, but there's a TON of out-of-band data in voice that cannot be easily replicated in text. That's a fact.
Tone of voice, pitch, pauses, irony, sarcasm (the last two are easier to misunderstand in text), etc.
And in-person or video chat adds one more data stream, the one with the highest signal-to-noise ratio: body language. "The mouth says yes, but the eyes say no."