This project is a text-to-speech (TTS) model. Most good TTS models are closed-source; I intend to make this one open-source.
The decent open-source TTS models are all fairly basic, with limited fine-tuning and no alignment (RLHF).
I plan to add both, although I am not sure there will be any demand for it. There is also a decent chance that Meta will give Llama 4 speech output, which would make this project obsolete.