DocumentCode
730763
Title
A spectral space warping approach to cross-lingual voice transformation in HMM-based TTS
Author
Hao Wang ; Soong, Frank ; Meng, Helen
Author_Institution
Dept. of Syst. Eng. & Eng. Manage., Chinese Univ. of Hong Kong, Hong Kong, China
fYear
2015
fDate
19-24 April 2015
Firstpage
4874
Lastpage
4878
Abstract
This paper presents a new approach to cross-lingual voice transformation in HMM-based TTS with only the recordings from two monolingual speakers in different languages (e.g. Mandarin and English). We aim to synthesize one speaker's speech in the other language. We regard the spectral space of any speaker to be composed of universal elementary units (i.e. tied-states) of speech in different languages. Our approach first forces the spectral spaces of the two speakers to have the same number of tied-states. Then we find an optimal one-to-one tied-state mapping between the two spectral spaces. Hence, the mapped speech trajectory in the spectral space of the target speaker can be found according to that generated in the spectral space of the reference speaker. Consequently, we can synthesize high-quality speech for the target monolingual speaker's voice in the other language. This can also be used as training data for a new TTS system.
Keywords
hidden Markov models; speech; speech synthesis; A spectral space warping approach; English; HMM-based TTS; Mandarin; cross-lingual voice transformation; hidden Markov model; mapped speech trajectory; monolingual speakers; optimal one-to-one tied-state mapping; text-to-speech synthesis; Decision trees; Hidden Markov models; Mathematical model; Speech; Training; Trajectory; Transforms; HMM-based TTS; cross-lingual; spectral space warping; voice transformation;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7178897
Filename
7178897