• DocumentCode
    1937786
  • Title
    Speech-rate-variable HMM-based Japanese TTS system
  • Author
    Iwano, K. ; Yamada, Makoto ; Togawa, T. ; Furui, S.
  • Author_Institution
    Tokyo Institute of Technology
  • fYear
    2002
  • fDate
    11-13 Sept. 2002
  • Firstpage
    219
  • Lastpage
    222
  • Abstract
    This paper proposes a new method for controlling phoneme duration according to an arbitrary target speech rate in speech synthesis (TTS, text-to-speech) systems. The proposed method first constructs three fundamental duration models at "fast", "normal", and "slow" speech rates using Hayashi's quantification theory (type 1) based on real speech databases, and then creates a duration model for a target speech rate by interpolating the fundamental models. Our TTS system uses an HMM-based synthesizer, which can achieve flexible prosody control. Speech synthesized by the proposed method is evaluated in subjective experiments at four speech rates, using paired comparison tests between the proposed method and a rule-based method. The results show that the proposed method achieves higher naturalness in synthesized speech than the rule-based method.
  • Keywords
    hidden Markov models; natural languages; speech processing; speech synthesis; HMM-based synthesizer; Hayashi quantification theory; Japanese language; TTS; arbitrary target speech rate; duration models; flexible prosody control; interpolation; naturalness; phoneme duration control; real speech databases; speech synthesis; text-to-speech systems; Aging; Computer science; Control system synthesis; Databases; Hidden Markov models; Speech analysis; Speech synthesis; Synthesizers; Testing; Vocoders;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Speech Synthesis, 2002. Proceedings of 2002 IEEE Workshop on
  • Print_ISBN
    0-7803-7395-2
  • Type
    conf
  • DOI
    10.1109/WSS.2002.1224413
  • Filename
    1224413
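
The interpolation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the interpolation formula, so piecewise-linear interpolation between neighbouring models is assumed here. The function name `interpolate_durations`, the anchor speech rates in `ANCHOR_RATES`, and the example durations are all hypothetical and not taken from the paper.

```python
from bisect import bisect_right

# Hypothetical anchor speech rates (morae per second) for the three
# fundamental duration models. Illustrative values, not from the paper.
ANCHOR_RATES = {"slow": 5.0, "normal": 8.0, "fast": 11.0}


def interpolate_durations(target_rate, model_durations, anchor_rates=ANCHOR_RATES):
    """Blend per-phoneme durations predicted by the fundamental models.

    model_durations maps a model name to a list of phoneme durations (ms)
    predicted for the same phoneme sequence. Piecewise-linear interpolation
    is an assumption of this sketch.
    """
    # Order the models by their anchor speech rate.
    anchors = sorted(anchor_rates.items(), key=lambda kv: kv[1])
    names = [name for name, _ in anchors]
    rates = [rate for _, rate in anchors]

    # Pick the pair of neighbouring models that brackets the target rate.
    # Targets outside the anchor range reuse the nearest pair, which
    # extrapolates linearly rather than clamping.
    i = bisect_right(rates, target_rate) - 1
    i = max(0, min(i, len(rates) - 2))

    # Weight of the faster model in the pair (0 at the slower anchor,
    # 1 at the faster anchor).
    w = (target_rate - rates[i]) / (rates[i + 1] - rates[i])

    slower = model_durations[names[i]]
    faster = model_durations[names[i + 1]]
    return [(1 - w) * a + w * b for a, b in zip(slower, faster)]


# Made-up durations (ms) for a two-phoneme sequence under each model.
example = {
    "slow": [120.0, 200.0],
    "normal": [80.0, 140.0],
    "fast": [60.0, 100.0],
}
halfway = interpolate_durations(6.5, example)  # midway between slow and normal
```

In a full system the three duration lists would come from the trained fundamental models rather than hand-written values, and the blended durations would be passed to the synthesizer.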