Title :
Voice conversion based on joint pitch and spectral transformation with component group-GMM
Author :
Ma, Jianchun ; Liu, Wenju
Author_Institution :
Nat. Lab. of Pattern Recognition, Chinese Acad. of Sci., Beijing, China
Date :
30 Oct.-1 Nov. 2005
Abstract :
Spectrum and pitch are the two most important features in voice conversion, as they carry the majority of speaker identity information. Some researchers use a GMM (Gaussian mixture model) to model spectrum and pitch jointly. However, these two features differ in unit and meaning, so some processing is needed before the model is trained. In this paper, a new framework, CG-GMM (component-group GMM), is used for joint pitch and spectral transformation. Experiments are set up and the results are compared with a previous voice conversion approach. The converted speech shows satisfactory speech quality and speaker identifiability. Moreover, the speaking style closely resembles that of the target speaker.
Keywords :
speaker recognition; spectral analysis; speech processing; speech synthesis; GMM; Gaussian mixture model; component group-GMM; pitch transformation; speaker identifiability; speaker identity information; speaking style; spectral transformation; speech quality; voice conversion; Artificial neural networks; Data mining; Discrete transforms; Laboratories; Maximum likelihood estimation; Natural languages; Pattern recognition; Spatial databases; Speech analysis; Speech synthesis;
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
Print_ISBN :
0-7803-9361-9
DOI :
10.1109/NLPKE.2005.1598734