Using LLMs:
- formulate the right prompts for the intro and outro generation
- pass the content of a post in segments while maintaining history; if you send it all in one go, you will exceed the token limit
- figure out how to integrate comments properly
- turn the summary into a spoken format, not a condensed written one
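
A minimal sketch of the "segments while maintaining history" idea, under stated assumptions: `call_llm` is a hypothetical function you supply (prompt in, text out), the prompt wording is only illustrative, and `max_chars` is a rough character-based stand-in for a real token count.

```python
from typing import Callable, List


def split_into_segments(text: str, max_chars: int) -> List[str]:
    """Group paragraphs into segments of at most max_chars characters.

    A single paragraph longer than max_chars is hard-split.
    """
    segments: List[str] = []
    current = ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # Hard-split paragraphs that cannot fit in one segment.
        while len(para) > max_chars:
            if current:
                segments.append(current)
                current = ""
            segments.append(para[:max_chars])
            para = para[max_chars:]
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            segments.append(current)
            current = para
    if current:
        segments.append(current)
    return segments


def summarize_in_segments(
    text: str,
    call_llm: Callable[[str], str],  # hypothetical: your own LLM wrapper
    max_chars: int = 4000,           # assumption: characters as a token proxy
) -> str:
    """Feed the post one segment at a time, carrying a running summary."""
    running_summary = ""
    for segment in split_into_segments(text, max_chars):
        prompt = (
            "Summary of the post so far:\n"
            f"{running_summary or '(nothing yet)'}\n\n"
            "Next segment of the post:\n"
            f"{segment}\n\n"
            "Update the summary so it covers everything seen so far."
        )
        running_summary = call_llm(prompt)
    return running_summary
```

The running summary is the "history" here: each call sees only the previous summary plus one new segment, so the prompt size stays bounded however long the post is.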
Using TTS:
- train the right voice, one that fits the content. Not all voices of a TTS engine have the same characteristics.
- understand the bugs of the TTS engine. For example, ElevenLabs, which we're using (it's beyond amazing overall, and the team is fantastic), struggles when given "$2.5". It reads it out as "dollar 2 (long pause) 5".
- a few more things
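
One possible workaround for the "$2.5" problem is to rewrite dollar amounts into words before the text reaches the TTS engine. This is only a sketch: the function name `spell_out_dollars` and the chosen spoken form ("2 point 5 dollars") are my assumptions, not anything the engine requires, and it only covers simple amounts.

```python
import re

# Matches "$2", "$2.5", and "$2.5 million" style amounts.
_AMOUNT = re.compile(r"\$(\d+)(?:\.(\d+))?(?:\s+(thousand|million|billion))?")


def spell_out_dollars(text: str) -> str:
    """Rewrite "$2.5" as "2 point 5 dollars" so the TTS engine never sees "$"."""

    def repl(match: re.Match) -> str:
        whole, frac, scale = match.group(1), match.group(2), match.group(3)
        spoken = whole
        if frac is not None:
            # Read the decimal digits one by one: "25" becomes "2 5".
            spoken += " point " + " ".join(frac)
        if scale is not None:
            spoken += " " + scale
        singular = whole == "1" and frac is None and scale is None
        return f"{spoken} {'dollar' if singular else 'dollars'}"

    return _AMOUNT.sub(repl, text)


print(spell_out_dollars("The plan costs $2.5 a month."))
```

Running this kind of normalization as a separate step before synthesis keeps the engine-specific quirks in one place, so new cases can be added as they turn up.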
Overall:
- figure out how to connect all of the different segments, music intros, outros, etc.
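
As a rough illustration of the stitching step, here is a sketch that joins audio files in order using only Python's standard `wave` module. It assumes every piece (intro music, spoken segments, outro) has already been exported as an uncompressed WAV with the same channel count, sample width, and sample rate; the function name and file names are hypothetical, and a real pipeline would probably also want crossfades and volume levelling.

```python
import wave
from typing import List, Optional, Tuple


def concatenate_wavs(paths: List[str], out_path: str) -> None:
    """Write the given WAV files back to back into out_path."""
    fmt: Optional[Tuple[int, int, int]] = None
    chunks: List[bytes] = []
    for path in paths:
        with wave.open(path, "rb") as src:
            this_fmt = (
                src.getnchannels(),
                src.getsampwidth(),
                src.getframerate(),
            )
            if fmt is None:
                fmt = this_fmt
            elif this_fmt != fmt:
                # Mixing formats would produce garbled audio, so refuse.
                raise ValueError(f"{path} has a different audio format")
            chunks.append(src.readframes(src.getnframes()))
    if fmt is None:
        raise ValueError("no input files given")
    with wave.open(out_path, "wb") as dst:
        dst.setnchannels(fmt[0])
        dst.setsampwidth(fmt[1])
        dst.setframerate(fmt[2])
        for chunk in chunks:
            dst.writeframes(chunk)


# Example call with hypothetical file names:
# concatenate_wavs(["intro.wav", "summary.wav", "outro.wav"], "episode.wav")
```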