• DocumentCode
    730760
  • Title

    Modulation spectrum-constrained trajectory training algorithm for GMM-based Voice Conversion

  • Author

    Takamichi, Shinnosuke ; Toda, Tomoki ; Black, Alan W. ; Nakamura, Satoshi

  • Author_Institution
    Grad. Sch. of Inf. Sci., Nara Inst. of Sci. & Technol. (NAIST), Nara, Japan
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    4859
  • Lastpage
    4863
  • Abstract
    This paper presents a novel training algorithm for Gaussian Mixture Model (GMM)-based Voice Conversion (VC). One advantage of GMM-based VC is its computationally efficient conversion processing, which makes real-time VC applications possible. On the other hand, the quality of the converted speech is still significantly worse than that of natural speech. To address this problem while preserving the computationally efficient conversion processing, the proposed training method makes it possible 1) to use a consistent optimization criterion between training and conversion and 2) to compensate the Modulation Spectrum (MS) of the converted parameter trajectory, a feature sensitively correlated with the over-smoothing effects that degrade the quality of the converted speech. The experimental results demonstrate that the proposed algorithm yields significant improvements in terms of both converted speech quality and conversion accuracy for speaker individuality compared to the basic training algorithm.
  • Keywords
    Gaussian processes; mixture models; speaker recognition; speech processing; GMM-based voice conversion; Gaussian mixture model; consistent optimization criterion; conversion accuracy improvement; converted parameter trajectory; converted speech quality improvement; modulation spectrum compensation; modulation spectrum-constrained trajectory training algorithm; over-smoothing effects; quality degradation; speaker individuality; Gold; Hafnium; Pragmatics; Speech; Training; modulation spectrum; over-smoothing; training algorithm
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type

    conf

  • DOI
    10.1109/ICASSP.2015.7178894
  • Filename
    7178894