DocumentCode :
2798871
Title :
Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis
Author :
Oura, Keiichiro ; Tokuda, Keiichi ; Yamagishi, Junichi ; King, Simon ; Wester, Mirjam
Author_Institution :
Dept. of Comput. Sci. & Eng., Nagoya Inst. of Technol., Nagoya, Japan
fYear :
2010
fDate :
14-19 March 2010
Firstpage :
4594
Lastpage :
4597
Abstract :
In the EMIME project, we are developing a mobile device that performs personalized speech-to-speech translation such that a user's spoken input in one language is used to produce spoken output in another language, while continuing to sound like the user's voice. We integrate two techniques, unsupervised adaptation for HMM-based TTS using a word-based large-vocabulary continuous speech recognizer and cross-lingual speaker adaptation for HMM-based TTS, into a single architecture. Thus, an unsupervised cross-lingual speaker adaptation system can be developed. Listening tests show very promising results, demonstrating that adapted voices sound similar to the target speaker and that differences between supervised and unsupervised cross-lingual speaker adaptation are small.
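The pipeline in the abstract — recognize the user's speech with an LVCSR to obtain transcriptions, estimate a speaker transform from those unsupervised labels, then carry the transform into the output language — can be illustrated with a toy sketch. All names, the state mapping, and the global-bias "transform" below are hypothetical simplifications; the actual system estimates linear-transform adaptation on HMM parameters, not a mean offset.

```python
import numpy as np

# Hypothetical average-voice model means for three HMM "states" per language.
INPUT_STATE_MEANS = {0: np.zeros(2), 1: np.ones(2), 2: np.full(2, 2.0)}
OUTPUT_STATE_MEANS = {10: np.zeros(2), 11: np.ones(2), 12: np.full(2, 2.0)}
STATE_MAPPING = {0: 10, 1: 11, 2: 12}  # input-language -> output-language state

def recognize(frames):
    """Stand-in for the word-based LVCSR: label each frame with the
    nearest input-language state (no reference transcript needed)."""
    return [min(INPUT_STATE_MEANS,
                key=lambda s: np.linalg.norm(f - INPUT_STATE_MEANS[s]))
            for f in frames]

def estimate_transform(frames, labels):
    """Unsupervised adaptation: a single global mean offset estimated
    from the recognized labels (a crude stand-in for linear transforms)."""
    return np.mean([f - INPUT_STATE_MEANS[s]
                    for f, s in zip(frames, labels)], axis=0)

def cross_lingual_adapt(bias):
    """Carry the speaker transform across languages via the state mapping,
    shifting the output-language average voice toward the target speaker."""
    return {STATE_MAPPING[s]: OUTPUT_STATE_MEANS[STATE_MAPPING[s]] + bias
            for s in STATE_MAPPING}

# Toy target speaker: sits 0.3 above the average voice in every dimension.
frames = [m + 0.3 for m in INPUT_STATE_MEANS.values()]
labels = recognize(frames)                         # unsupervised labels
adapted = cross_lingual_adapt(estimate_transform(frames, labels))
```

In this sketch the supervised and unsupervised cases coincide whenever the recognizer labels the frames correctly, which mirrors the abstract's finding that the gap between the two conditions is small.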
Keywords :
hidden Markov models; natural language processing; speech synthesis; EMIME project; HMM-based TTS; HMM-based speech synthesis; mobile device; speech-to-speech translation; unsupervised cross-lingual speaker adaptation; word-based large-vocabulary continuous speech recognizer; Automatic speech recognition; Computer science; Databases; Decision trees; Hidden Markov models; Loudspeakers; Natural languages; Speech analysis; Speech recognition; Speech synthesis; HMM-based speech synthesis; unsupervised cross-lingual speaker adaptation;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
2010 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
Dallas, TX
ISSN :
1520-6149
Print_ISBN :
978-1-4244-4295-9
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2010.5495558
Filename :
5495558