  • DocumentCode
    3124129
  • Title
    Resonance-based spectral deformation in HMM-based speech synthesis
  • Author
    Ni, Jinfu ; Shiga, Yoshinori ; Kawai, Hiroyuki ; Kashioka, Hideki
  • Author_Institution
    Spoken Language Commun. Lab., Universal Commun. Res. Inst., Kyoto, Japan
  • fYear
    2012
  • fDate
    5-8 Dec. 2012
  • Firstpage
    88
  • Lastpage
    92
  • Abstract
    Speech quality in statistical parametric speech synthesis depends on the training samples covering a sufficient range of acoustical features. This paper presents a spectral deformation method that uses spectral-spatial information to expand the density space of acoustical features when only limited training samples are available. The method diffuses observed mel-cepstra in a resonance field, producing multiple spectral variants subject to a resonance mechanism. A statistical contribution of the mel-cepstral variants then takes the place of the original mel-cepstra when building HMM-based voices. Preliminary speech synthesis experiments were carried out in Chinese and Japanese. The results indicate that the proposed method can mitigate potential discontinuity and enhance speech formants for noise reduction, while achieving MOS quality at least as good as that obtained with the original mel-cepstra.
  • Keywords
    hidden Markov models; speech synthesis; Chinese; HMM-based speech synthesis; HMM-based voices; Japanese; MOS quality; acoustical features; noise reduction; observed mel-cepstra; resonance mechanism; resonance-based spectral deformation; spectral-spatial information; speech formants; statistical parametric speech synthesis; training samples; Hidden Markov models; High temperature superconductors; Mel frequency cepstral coefficient; Mirrors; Speech; Speech synthesis; Training; Spectral deformation; resonances; statistical parametric speech synthesis; voicefonts
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
  • Conference_Location
    Kowloon
  • Print_ISBN
    978-1-4673-2506-6
  • Electronic_ISBN
    978-1-4673-2505-9
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2012.6423478
  • Filename
    6423478