DocumentCode :
730661
Title :
Sparse representation for frequency warping based voice conversion
Author :
Xiaohai Tian ; Zhizheng Wu ; Siu Wa Lee ; Nguyen Quy Hy ; Eng Siong Chng ; Minghui Dong
Author_Institution :
Sch. of Comput. Eng., Nanyang Technol. Univ. (NTU), Singapore, Singapore
fYear :
2015
fDate :
19-24 April 2015
Firstpage :
4235
Lastpage :
4239
Abstract :
This paper presents a sparse representation framework for weighted frequency warping based voice conversion. In this method, a frame-dependent warping function and the corresponding spectral residual vector are first calculated for each source-target spectrum pair. At conversion time, a source spectrum is factorised as a linear combination of a set of source spectra from the training data. The linear combination weights, which are constrained to be sparse, are used to interpolate the frame-dependent warping functions and spectral residual vectors. In this way, the proposed method not only avoids the statistical averaging caused by GMM but also preserves the high-resolution spectral details needed for high-quality converted speech. Experiments are conducted on the VOICES database. Both objective and subjective results confirm the effectiveness of the proposed method. In particular, the spectral distortion dropped from 5.55 dB for the conventional frequency warping approach to 5.0 dB for the proposed method. Compared to the state-of-the-art GMM-based conversion with global variance (GV) enhancement, our method achieved a 68.5% preference score in an AB preference test.
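The conversion step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function names `sparse_weights` and `convert_frame`, the multiplicative-update solver with an L1 penalty, the normalisation of the weights before interpolation, and the use of linear interpolation to apply the warping function are all hypothetical choices. The abstract only states that sparse weights over training source spectra are used to interpolate the stored warping functions and residual vectors.

```python
import numpy as np

def sparse_weights(x, A, sparsity=0.1, n_iter=200, eps=1e-12):
    """Estimate non-negative sparse weights h so that A @ h approximates x.

    x: (bins,) non-negative source spectrum.
    A: (bins, N) non-negative dictionary of training source spectra.
    Uses multiplicative updates for a Euclidean cost with an L1 penalty
    (an assumed solver; the abstract does not name one).
    """
    h = np.full(A.shape[1], 1.0 / A.shape[1])
    Atx = A.T @ x
    AtA = A.T @ A
    for _ in range(n_iter):
        h *= Atx / (AtA @ h + sparsity + eps)
    return h

def convert_frame(x, A, warps, residuals, sparsity=0.1):
    """Convert one frame by interpolating per-exemplar warps and residuals.

    warps:     (bins, N) warping functions, one column per exemplar; each
               entry is the source bin position to read for that output bin
               (assumed convention).
    residuals: (bins, N) spectral residual vectors, one column per exemplar.
    """
    h = sparse_weights(x, A, sparsity)
    w = h / (h.sum() + 1e-12)        # assumed: weights normalised to sum to 1
    warp = warps @ w                 # interpolated warping function
    resid = residuals @ w            # interpolated residual vector
    bins = np.arange(x.shape[0], dtype=float)
    warped = np.interp(warp, bins, x)  # resample the source spectrum
    return warped + resid, h
```

Because each frame gets its own small set of active exemplars, the interpolated warping function and residual stay close to individual training frames instead of being averaged over the whole training set, which is the property the abstract contrasts with GMM-based conversion.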
Keywords :
Gaussian processes; interpolation; matrix algebra; speech processing; GMM; VOICES database; frame-dependent warping function; high-quality converted speech; linear combination weight matrix; source-target spectrum pair; sparse representation framework; spectral residual vector; weighted frequency warping based voice conversion; Dictionaries; Discrete Fourier transforms; Distortion; Frequency conversion; Spectrogram; Speech; Speech processing; Voice conversion; exemplar; frequency warping; residual compensation; sparse representation;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
Type :
conf
DOI :
10.1109/ICASSP.2015.7178769
Filename :
7178769