• DocumentCode
    3124038
  • Title

    A simple and effective pitch re-estimation method for rich prosody and speaking styles in HMM-based speech synthesis

  • Author

    Cheng-Yuan Lin ; Chien-Hung Huang ; Chih-Chung Kuo

  • Author_Institution
    ITRI, Hsinchu, Taiwan
  • fYear
    2012
  • fDate
    5-8 Dec. 2012
  • Firstpage
    286
  • Lastpage
    290
  • Abstract
    This paper proposes a novel method of controllable pitch re-estimation that can produce better pitch contours or provide diverse speaking styles for text-to-speech (TTS) systems. The method is composed of a pitch re-estimation model and a set of control parameters. The pitch re-estimation model is employed to reduce the over-smoothing effects that are usually introduced by TTS training. The control parameters are designed to generate not only rich intonations but also speaking styles, e.g., a foreign accent or an excited tone. To verify the feasibility of the proposed method, we conducted experiments with both objective measures and subjective tests. Although the re-estimated pitch yields only slightly lower prediction error on the objective measures, it produces clearly better intonation in the listening test. Moreover, expressive speech can be generated successfully under the framework of controllable pitch re-estimation.
  • Keywords
    hidden Markov models; speech synthesis; HMM-based speech synthesis; TTS systems; TTS training; controllable pitch reestimation; effective pitch reestimation method; listening test; pitch reestimation model; prediction error; rich prosody styles; speaking styles; text-to-speech systems; Estimation; Hidden Markov models; Robots; Shape; Speech; Speech synthesis; Training; Pitch re-estimation; expressive speech; prosody control; text-to-speech;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
  • Conference_Location
    Kowloon
  • Print_ISBN
    978-1-4673-2506-6
  • Electronic_ISBN
    978-1-4673-2505-9
  • Type

    conf

  • DOI
    10.1109/ISCSLP.2012.6423473
  • Filename
    6423473