• DocumentCode
    134220
  • Title
    Integrating global variance of log power spectrum derived from LSPs into MGE training for HMM-based parametric speech synthesis
  • Author
    Yu-Sheng Sun; Zhen-Hua Ling; Xiang Yin; Li-Rong Dai
  • Author_Institution
    Nat. Eng. Lab. for Speech & Language Inf. Process., Univ. of Sci. & Technol. of China, Hefei, China
  • fYear
    2014
  • fDate
    12-14 Sept. 2014
  • Firstpage
    201
  • Lastpage
    205
  • Abstract
    This paper presents a method to improve hidden Markov model (HMM) based parametric speech synthesis by integrating the global variance (GV) of the log power spectrum (LPS) derived from line spectral pairs (LSPs) into minimum generation error (MGE) model training. An LPS-GV based parameter generation method was previously proposed to alleviate the over-smoothing of generated spectral structures. That method improved the naturalness of synthetic speech when LSPs were used as spectral features, but it significantly increased the complexity of parameter generation at synthesis time. In this paper, we propose integrating the distortion of LPS-GV derived from LSPs into the MGE training criterion, so that LPS-GV information is used at training time instead of at synthesis time. Experimental results show that, when LSPs are used as spectral features, the proposed method achieves better naturalness of synthetic speech than conventional MGE model training, with no loss of efficiency at synthesis time.
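    The quantity named in the abstract, the GV of an LPS derived from LSPs, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes an even LSP order, LSP frequencies in radians, unit filter gain, and GV defined as the per-frequency-bin variance of the LPS across the frames of one utterance. The function names and the `n_bins` parameter are hypothetical.

```python
import math


def lsp_to_log_power_spectrum(lsps, n_bins=64):
    """Log power spectrum of the all-pole filter 1/A(z) for one frame.

    lsps: ascending LSP frequencies in radians, each in (0, pi).
    Assumptions of this sketch: even order and unit gain.
    """
    order = len(lsps)
    if order % 2 != 0:
        raise ValueError("this sketch only handles an even LSP order")
    # 1st, 3rd, ... LSPs are roots of the symmetric polynomial P(z);
    # 2nd, 4th, ... LSPs are roots of the antisymmetric polynomial Q(z).
    cos_p = [math.cos(w) for w in lsps[0::2]]
    cos_q = [math.cos(w) for w in lsps[1::2]]
    lps = []
    for k in range(n_bins):
        omega = math.pi * k / (n_bins - 1)
        c = math.cos(omega)
        prod_p = math.prod((c - ci) ** 2 for ci in cos_p)
        prod_q = math.prod((c - ci) ** 2 for ci in cos_q)
        # |A(e^{j omega})|^2 written directly in terms of the LSPs.
        a_sq = 2.0 ** order * (
            math.cos(omega / 2) ** 2 * prod_p
            + math.sin(omega / 2) ** 2 * prod_q
        )
        lps.append(-math.log(a_sq))  # log |1/A|^2
    return lps


def global_variance(frames):
    """Per-dimension variance over all frames of one utterance."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    return [
        sum((f[d] - means[d]) ** 2 for f in frames) / n for d in range(dims)
    ]


def lps_gv(lsp_frames, n_bins=64):
    """GV of the LSP-derived log power spectrum, one value per frequency bin."""
    spectra = [lsp_to_log_power_spectrum(f, n_bins) for f in lsp_frames]
    return global_variance(spectra)
```

    In the setting the abstract describes, a distortion between the LPS-GV of natural and generated trajectories would enter the MGE training criterion; that training step is not sketched here.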
  • Keywords
    computational complexity; hidden Markov models; speech synthesis; HMM-based parametric speech synthesis; LPS-GV based parameter generation method; LSP; MGE training; global variance; hidden Markov model; line spectral pairs; log power spectrum; minimum generation error model training; parameter generation complexity; spectral features; synthesis time; synthetic speech; Acoustic distortion; Acoustics; Speech; Training; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
  • Conference_Location
    Singapore
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2014.6936612
  • Filename
    6936612