Sorry for the confusing phrasing about STT vs TTS. I'm not familiar with cases where you would use something like this 'at the edge' instead of, say, a laptop. I was thinking maybe some sort of offline setup with a microphone -- but in that case the audio only arrives in real time anyway. Do you have some use cases in mind?
1/4 of the price for 1/3 of the speed is a good deal! Presumably still faster than faster-whisper on the same hardware?