The Brazilian Portuguese model is a bit of an extreme showcase (and thus really cool!), as it was trained on a single speaker (the data was recorded entirely by the main author of the paper, Edresson Casanova, who's Brazilian).
The fact that it can do multi-lingual voice cloning at all in that case is already surprising. You can find more details on the project page [0] and in the paper [1]. And here's the corpus [2].
[0] https://edresson.github.io/YourTTS/
[1] https://arxiv.org/abs/2112.02418
[2] https://edresson.github.io/TTS-Portuguese-Corpus/