DocumentCode :
3436833
Title :
Voice conversion with a strategy for separating speaker individuality using state-space model
Author :
Xu, Ning ; Yang, Zhen ; Guo, Haiyan
Author_Institution :
Inst. of Signal Process. & Transm., Nanjing Univ. of Posts & Telecommun., Nanjing, China
fYear :
2010
fDate :
25-27 June 2010
Firstpage :
298
Lastpage :
301
Abstract :
It is well known that the key to voice conversion (VC) is transforming the spectral parameters of the source speaker to match those of the target speaker, a task for which Gaussian mixture model (GMM) based statistical transformations have been commonly studied. However, these methods operate frame by frame, disregarding the evolution of the spectral envelope and significantly degrading the quality of the converted speech. In this paper, we propose a new voice conversion method using the state-space model (SSM), which can describe the dynamics between frames. The physical meaning of the SSM for voice conversion is then examined, leading to novel SSM-based training and transformation procedures. Experiments using both objective and subjective measurements show that the proposed SSM-based method significantly outperforms the traditional GMM-based technique.
Keywords :
Covariance matrix; Degradation; Filtering; Graphics; Hidden Markov models; Loudspeakers; Predictive models; Signal processing; Speech; Virtual colonoscopy; Spectral envelope evolution; State-space model; Voice conversion
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Wireless Communications, Networking and Information Security (WCNIS), 2010 IEEE International Conference on
Conference_Location :
Beijing, China
Print_ISBN :
978-1-4244-5850-9
Type :
conf
DOI :
10.1109/WCINS.2010.5541787
Filename :
5541787