But that's hardware accelerated. There's a lot of brain given over to doing just that. When you stop that working by presenting the conversation in an incompatible codec (threaded), you switch over to the equivalent of software decoding.
And that is as bad an idea for conversation as it is for H.264.