
jonplackett 2y ago

I wonder when computers will start taking our intonation into account too. That would really help with understanding the end of a phrase. And there’s SO MUCH information in intonation that doesn’t exist in pure text. Any AI that doesn’t understand that part of language will always still be kinda dumb, however clever they are.

modeless 2y ago

You're right. Ultimately, the only way this will really work is as an end-to-end model. Text will only get you so far. We could approximate it now with screenplay-like emotion annotations on text, which LLMs should both easily understand and be able to produce themselves (though you'd have to train a new speech recognition system to produce them). But end-to-end will be required eventually to reach human-level fluency.

hk__2 2y ago

Don’t they do it already? There are a lot of languages where intonation is absolutely necessary to distinguish between some words, so I would be surprised if this were not already taken into account by the major voice assistants.
