1. Speech to text
2. Fix up/edit the text with GPT-3
3. Text to speech in the original speaker's voice(s), preserving prosody and inflection, with VALL-E.
If it's done with every participant's consent, I don't see how it's not legitimate.
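
For what it's worth, the three steps above could be wired together roughly like this. It is only a sketch: `Segment`, `redub`, and the three stage callables (`transcribe`, `edit_text`, `synthesize`) are hypothetical names made up for illustration, not the API of any real speech-to-text service, GPT-3 client, or VALL-E release. You would plug in whatever models you actually have access to. The consent check up front is an assumption about how one might enforce the "every participant" condition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Segment:
    speaker: str  # who is talking in this stretch of audio
    audio: bytes  # raw audio for the segment


def redub(
    segments: List[Segment],
    consent: Dict[str, bool],
    transcribe: Callable[[bytes], str],          # step 1: speech to text (hypothetical)
    edit_text: Callable[[str], str],             # step 2: text cleanup, e.g. an LLM (hypothetical)
    synthesize: Callable[[str, bytes], bytes],   # step 3: TTS given text + voice sample (hypothetical)
) -> List[Segment]:
    """Run every segment through transcribe -> edit -> re-synthesize."""
    # Refuse to process anything unless every speaker has opted in.
    missing = sorted({s.speaker for s in segments if not consent.get(s.speaker, False)})
    if missing:
        raise PermissionError(f"no consent from: {', '.join(missing)}")

    result = []
    for seg in segments:
        text = transcribe(seg.audio)
        cleaned = edit_text(text)
        # The original audio doubles as the voice sample, so the new speech
        # can be rendered in the same speaker's voice.
        new_audio = synthesize(cleaned, seg.audio)
        result.append(Segment(seg.speaker, new_audio))
    return result
```

Passing the stages in as plain functions keeps the sketch independent of any particular vendor, and means a missing consent entry stops the whole run before any audio is touched.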