• DocumentCode
    430199
  • Title

    Improving the performance of MGM-based voice conversion by preparing training data method

  • Author

    Zuo, Guo-Yu ; Liu, Wen-Ju ; Ruan, Xiao-gang

  • Author_Institution
    Inst. of Autom., Acad. Sinica, Beijing, China
  • fYear
    2004
  • fDate
    15-18 Dec. 2004
  • Firstpage
    181
  • Lastpage
    184
  • Abstract
    This paper proposes an approach to improve both the target speaker's individuality and the quality of the converted speech by preparing the training data. In mixture Gaussian spectral mapping (MGM) based voice conversion, spectral feature representations are analyzed to obtain the right feature associations between the source and target characteristics. A voiced/unvoiced (V/UV) decision scheme for time-alignment is provided to obtain the right data for training the MGM function while removing the misaligned data. Experiments are conducted on applying different spectral representation methods and V/UV decision strategies to the MGM functions. When linear predictive cepstral coefficients (LPCC) are used for time-alignment and the V/UV decisions are adopted for removing bad data, results show that the conversion function achieves better accuracy and that the proposed method can effectively improve the overall performance of voice conversion.
  • Keywords
    Gaussian distribution; cepstral analysis; signal representation; speech processing; LPCC; MGM-based voice conversion; conversion function accuracy; converted speech quality; linear predictive cepstral coefficients; mixture Gaussian spectral mapping voice conversion; prepared training data method; source/target feature associations; speaker individuality; spectral feature representations; voiced/unvoiced time-alignment decision scheme; Automation; Cepstral analysis; Control engineering; Covariance matrix; Laboratories; Loudspeakers; Pattern recognition; Speech analysis; Telephony; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing, 2004 International Symposium on
  • Print_ISBN
    0-7803-8678-7
  • Type

    conf

  • DOI
    10.1109/CHINSL.2004.1409616
  • Filename
    1409616
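
The abstract describes time-aligning source and target frames and then using V/UV decisions to discard misaligned pairs before training the mapping function. The record does not give the actual scheme, so the following is only a minimal sketch under stated assumptions: alignment is plain dynamic time warping over Euclidean frame distances, and a pair is treated as misaligned when its two frames carry different voicing labels. The names `dtw_path` and `filter_pairs` are hypothetical, and the toy 1-D features stand in for real spectral features such as LPCC.

```python
import numpy as np


def dtw_path(src, tgt):
    """Align two feature sequences (frames x dims) with plain DTW.

    Assumption: Euclidean frame distance and the basic three-way step
    pattern; the paper's actual alignment settings are not given.
    Returns a list of (source_index, target_index) pairs.
    """
    n, m = len(src), len(tgt)
    dist = np.linalg.norm(src[:, None, :] - tgt[None, :, :], axis=2)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]
            )
    # Backtrack from the last frame pair to recover the alignment path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]


def filter_pairs(path, src_voiced, tgt_voiced):
    """Keep only aligned pairs whose V/UV labels agree.

    Assumption: a voicing mismatch marks a misaligned pair; the V/UV
    labels themselves are taken as given inputs.
    """
    return [(i, j) for i, j in path if src_voiced[i] == tgt_voiced[j]]


# Toy 1-D "features" and hand-written voicing flags, for illustration only.
src = np.array([[0.0], [1.0], [2.0]])
tgt = np.array([[0.0], [1.0], [1.0], [2.0]])
src_voiced = [True, True, False]
tgt_voiced = [True, True, False, False]

path = dtw_path(src, tgt)
kept = filter_pairs(path, src_voiced, tgt_voiced)
```

Only the pairs in `kept` would then be passed on as training data for the mapping function; how the V/UV labels are computed is left outside this sketch.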