Title :
Voice conversion with a strategy for separating speaker individuality using state-space model
Author :
Xu, Ning ; Yang, Zhen ; Guo, Haiyan
Author_Institution :
Inst. of Signal Process. & Transm., Nanjing Univ. of Posts & Telecommun., Nanjing, China
Abstract :
It is well known that the key to voice conversion (VC) is to transform the spectral parameters of the source speaker to match those of the target speaker, a task for which Gaussian mixture model (GMM) based statistical transformations have been commonly studied. However, these methods operate frame by frame, disregarding the evolution of the spectral envelope and significantly degrading the quality of the converted speech. In this paper, we propose a new voice conversion method using the state-space model (SSM), which can inherently describe the dynamics of the features between frames. The physical meaning of the SSM for voice conversion is then examined, leading to novel SSM-based training and transformation procedures. Experiments using both objective and subjective measurements show that the proposed SSM-based method significantly outperforms the traditional GMM-based technique.
Keywords :
Covariance matrix; Degradation; Filtering; Graphics; Hidden Markov models; Loudspeakers; Predictive models; Signal processing; Speech; Virtual colonoscopy; Spectral envelope evolution; state-space model; voice conversion
Conference_Title :
Wireless Communications, Networking and Information Security (WCNIS), 2010 IEEE International Conference on
Conference_Location :
Beijing, China
Print_ISBN :
978-1-4244-5850-9
DOI :
10.1109/WCINS.2010.5541787