• DocumentCode
    3528017
  • Title
    Full covariance state duration modeling for HMM-based speech synthesis
  • Author
    Lu, Heng ; Wu, Yi-Jian ; Tokuda, Keiichi ; Dai, Li-Rong ; Wang, Ren-Hua
  • Author_Institution
    iFlytek Speech Lab., Univ. of Sci. & Technol. of China, Hefei
  • fYear
    2009
  • fDate
    19-24 April 2009
  • Firstpage
    4033
  • Lastpage
    4036
  • Abstract
    This paper proposes a state duration modeling method that uses a full covariance matrix for HMM-based speech synthesis. In this method, the multi-dimensional Gaussian distribution that models the state durations of each context-dependent phoneme uses a full covariance matrix instead of the conventional diagonal covariance matrix. At the synthesis stage, the state durations are predicted from the clustered context-dependent distributions with full covariance matrices. Experimental results show that, when the speaking rate of the synthesized speech is changed, speech synthesized with full-covariance state duration models sounds more natural than speech synthesized with the conventional method.
  • Keywords
    Gaussian distribution; covariance matrices; hidden Markov models; pattern clustering; speech synthesis; HMM-based speech synthesis; clustered context-dependent distribution; context-dependent phoneme; full covariance matrix state duration modeling; hidden Markov model; multidimensional Gaussian distribution; Computer science; Context modeling; Covariance matrix; Flowcharts; Gaussian distribution; Hidden Markov models; High temperature superconductors; Predictive models; Speech synthesis; Training data; HMM; duration; full covariance; speech synthesis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
  • Conference_Location
    Taipei
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2009.4960513
  • Filename
    4960513
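
The abstract gives no formulas, so the following is only a minimal sketch of one standard way to predict state durations from a Gaussian duration model when the speaking rate is changed; it is not taken from the paper. It assumes the durations of one phoneme's states follow N(mean, cov) and that a changed speaking rate is expressed as a target total duration. The most likely durations under the constraint sum(d) = total are then d = mean + rho * (cov · 1), with rho = (total - sum(mean)) / (1ᵀ · cov · 1). The function name `predict_durations` and all numeric values are hypothetical.

```python
def predict_durations(mean, cov, total):
    """Most likely state durations under N(mean, cov), given sum(d) == total.

    mean:  list of per-state mean durations (in frames)
    cov:   covariance matrix as a list of rows (assumed positive definite)
    total: target total duration of the phoneme (in frames)
    """
    # cov multiplied by the all-ones vector is just the row sums.
    row_sums = [sum(row) for row in cov]
    # 1^T cov 1; strictly positive when cov is positive definite.
    denom = sum(row_sums)
    # Scalar that spreads the extra (or missing) frames over the states.
    rho = (total - sum(mean)) / denom
    return [m + rho * r for m, r in zip(mean, row_sums)]


# Hypothetical three-state phoneme, slowed from 12 frames to 18 frames.
mean = [3.0, 5.0, 4.0]
diag_cov = [[1.0, 0.0, 0.0],
            [0.0, 4.0, 0.0],
            [0.0, 0.0, 1.0]]
full_cov = [[1.0, 0.5, 0.0],
            [0.5, 4.0, 0.5],
            [0.0, 0.5, 1.0]]

diag_durations = predict_durations(mean, diag_cov, 18.0)
full_durations = predict_durations(mean, full_cov, 18.0)
print(diag_durations)
print(full_durations)
```

With a diagonal covariance matrix, each state is stretched in proportion to its own variance only. With a full covariance matrix, the off-diagonal terms also influence how the extra frames are shared among the states, which is the kind of inter-state dependency a full-covariance duration model can capture.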