• DocumentCode
    535049
  • Title
    An algorithm for Chinese voice conversion based on phonetic Gaussian mixture model
  • Author
    Li, Yanping ; Zhang, Linghua ; Ding, Hui
  • Author_Institution
    Coll. of Telecommun. & Inf. Eng., Nanjing Univ. of Posts & Telecommun., Nanjing, China
  • Volume
    7
  • fYear
    2010
  • fDate
    16-18 Oct. 2010
  • Firstpage
    3490
  • Lastpage
    3494
  • Abstract
    This paper proposes a novel algorithm for Chinese voice conversion based on a phonetic Gaussian mixture model. The proposed method performs spectral feature conversion separately for each phoneme category using the phonetic Gaussian mixture model, which prevents the spectral smoothing of the traditional Gaussian mixture model (GMM) and avoids phoneme imbalance between training and testing materials, thereby improving voice intelligibility and naturalness. Furthermore, pitch is modified by manipulating the linear prediction residual, using knowledge of the instants of significant excitation, in order to improve the quality of the synthesized speech. In an objective test, similarity to the target voice spectrum was evaluated, and the proposed algorithm improved similarity by 9.31% compared with GMM. In a subjective listening test, an ABX test was performed and the proposed algorithm was preferred over the baseline algorithm by 10.36%; it also improved quality by 29.33% in terms of mean opinion score (MOS).
  • Keywords
    Gaussian processes; feature extraction; smoothing methods; spectral analysis; speech synthesis; ABX test; Chinese voice conversion algorithm; GMM; baseline algorithm; linear prediction-residual manipulation; mean opinion score; phonetic Gaussian mixture model; spectral feature conversion; spectral smoothing; speech synthesis quality; subjective listening test; Classification algorithms; Covariance matrix; Signal processing algorithms; Speech; Testing; Training; Vectors; Chinese vowel mapping; Gaussian Mixture Model; phoneme classification; voice conversion;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image and Signal Processing (CISP), 2010 3rd International Congress on
  • Conference_Location
    Yantai
  • Print_ISBN
    978-1-4244-6513-2
  • Type
    conf
  • DOI
    10.1109/CISP.2010.5646756
  • Filename
    5646756