Speech-dispatcher commonly uses espeak-ng, which sounds robotic but is reportedly better suited to visually impaired users because it remains intelligible at high speaking rates. This lets them hear UI labels more quickly. For users who are not visually impaired, we generally want natural-sounding voices, using TTS the way we would listen to a podcast or a bedtime story.
With this system, users are in full control and can swap TTS models easily. If a model were shipped and, two weeks later, a smaller, newer, or better one appeared, the work of whoever shipped it would become obsolete very quickly.
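
As a rough illustration of what swapping looks like in practice, here is a minimal sketch that drives the `spd-say` command-line client shipped with speech-dispatcher. The helper names `build_spd_say_command` and `speak` are made up for this example, and the module name `piper` is only a placeholder: which output modules exist depends entirely on the local speech-dispatcher configuration (`spd-say -O` lists them).

```python
import shutil
import subprocess


def build_spd_say_command(text, module=None, rate=0):
    """Build an spd-say argument list (hypothetical helper).

    module=None leaves the choice to the user's speech-dispatcher default;
    rate follows spd-say's -100..100 scale, where 0 is the normal rate.
    """
    if not -100 <= rate <= 100:
        raise ValueError("rate must be between -100 and 100")
    cmd = ["spd-say", "--wait", "--rate", str(rate)]
    if module is not None:
        cmd += ["--output-module", module]
    cmd.append(text)
    return cmd


def speak(text, module=None, rate=0):
    """Speak text if spd-say is installed; return False when it is missing."""
    if shutil.which("spd-say") is None:
        return False
    result = subprocess.run(build_spd_say_command(text, module, rate), check=False)
    return result.returncode == 0


if __name__ == "__main__":
    # Fast, robotic voice for quickly reading a UI label.
    speak("Save button", module="espeak-ng", rate=80)
    # Placeholder module name for a natural-sounding voice; assumes the
    # user has configured such a module themselves.
    speak("Once upon a time", module="piper", rate=0)
```

Because the application only names a module (or names none and accepts the default), replacing the model behind it is a configuration change on the user's side rather than a change to the application.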