Title :
C27. Investigation about speech conversion using different techniques
Author :
Elmanfaloty, Rania A. ; Korany, Noha O. ; Youssef, El-Sayed A.
Author_Institution :
Fac. of Eng., Alexandria Univ., Alexandria, Egypt
Abstract :
Voice conversion (VC) modifies the speech signal produced by a source speaker so that it sounds as if it were spoken by a target speaker. In this paper we compare two techniques for voice conversion. In the first technique, a conversion function based on a Gaussian mixture model (GMM) transforms the spectral envelope, described by line spectral frequency (LSF) parameters and linear predictive coefficient (LPC) residuals or Mel frequency cepstral coefficient (MFCC) parameters. The second technique uses Pitch Synchronous Overlap Add (PSOLA) and resampling. The two techniques are compared using subjective evaluation as well as objective measures such as mean squared error (MSE) and pitch estimation.
Keywords :
Gaussian processes; mean square error methods; speech processing; Gaussian mixture model; Mel frequency cepstral coefficient; conversion function; line spectral frequency; linear predictive coefficient; mean squared error; pitch estimation; pitch synchronous overlap add; resampling method; source speaker; spectral envelope conversion; speech conversion; target speaker; voice conversion; Educational institutions; Estimation; Speech; Training; Vectors; GMM; LSF; MFCC; PSOLA; VC
Conference_Title :
Radio Science Conference (NRSC), 2012 29th National
Conference_Location :
Cairo
Print_ISBN :
978-1-4673-1884-6
DOI :
10.1109/NRSC.2012.6208545