• DocumentCode
    701809
  • Title
    Data-driven pause prediction for speech synthesis in storytelling style speech
  • Author
    Sarkar, Parakrant ; Rao, K. Sreenivasa
  • Author_Institution
    Sch. of Inf. Technol., Indian Inst. of Technol. Kharagpur, Kharagpur, India
  • fYear
    2015
  • fDate
    Feb. 27 2015-March 1 2015
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    In storytelling speech, pauses play a significant role in introducing suspense and climax. They are used to emphasize keywords and emotion-salient words and to separate the phrases in an utterance. The objective of this work is to predict the position and duration of pauses in speech synthesized by a text-to-speech system. We analyzed the pause patterns in storyteller speech and classified the pauses into three categories: short, medium and long. A data-driven, three-stage pause prediction model is proposed. In the first stage, a model is built to identify the pause positions within an utterance using a set of word-level features. In the second stage, the pauses are classified into the three categories using a set of syllable-level features. In the final stage, a regression predictor is trained to predict the pause duration for each category. We conducted both objective and subjective tests to evaluate the proposed method. In the subjective evaluation, subjects perceived a noticeable difference in the speech synthesized using the proposed method.
  • Keywords
    natural language processing; speech synthesis; climax; data-driven pause prediction; emotion-salient words; keywords; phrases; prediction model; regression predictor; storyteller speech synthesis; storytelling style speech; suspense; syllable-level features; text-to-speech system; utterance; Accuracy; Buildings; Feature extraction; Hidden Markov models; Predictive models; Speech; Breaks; Non-break; Pause Duration; Pause prediction; Phrasing; Storytelling style; silences
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications (NCC), 2015 Twenty First National Conference on
  • Conference_Location
    Mumbai
  • Type
    conf
  • DOI
    10.1109/NCC.2015.7084924
  • Filename
    7084924
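
The three-stage model described in the abstract could be sketched roughly as below. This is only an illustration of the control flow (position, then category, then per-category duration), not the authors' implementation. Everything concrete in it is an assumption: the punctuation rules stand in for the word-level and syllable-level classifiers, the single "words since the previous pause" feature and the one-variable least-squares regressor are hypothetical, and the training durations are made-up toy numbers rather than values from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class LinearRegressor:
    """One-feature least-squares line; a stand-in for the per-category regressor."""
    slope: float = 0.0
    intercept: float = 0.0

    def fit(self, xs: Sequence[float], ys: Sequence[float]) -> "LinearRegressor":
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        var_x = sum((x - mean_x) ** 2 for x in xs)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        self.slope = cov_xy / var_x if var_x else 0.0
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, x: float) -> float:
        return self.slope * x + self.intercept


def predict_pauses(
    words: Sequence[str],
    has_pause: Callable[[Sequence[str], int], bool],
    categorize: Callable[[Sequence[str], int], str],
    regressors: Dict[str, LinearRegressor],
) -> List[Tuple[int, str, float]]:
    """Return (word_index, category, duration_ms) for each predicted pause."""
    pauses = []
    last_pause = -1
    for i in range(len(words)):
        if not has_pause(words, i):        # stage 1: is there a pause after word i?
            continue
        category = categorize(words, i)    # stage 2: short, medium or long
        gap = float(i - last_pause)        # hypothetical feature: words since last pause
        pauses.append((i, category, regressors[category].predict(gap)))  # stage 3
        last_pause = i
    return pauses


# Toy stand-ins for the trained stage 1 and stage 2 models (assumptions).
def toy_has_pause(words: Sequence[str], i: int) -> bool:
    return words[i][-1] in ",;.!?"


def toy_categorize(words: Sequence[str], i: int) -> str:
    return {",": "short", ";": "medium"}.get(words[i][-1], "long")


# Made-up training pairs (gap in words, duration in ms), one set per category.
gaps = [1.0, 2.0, 3.0]
toy_durations_ms = {
    "short": [100.0, 150.0, 200.0],
    "medium": [200.0, 300.0, 400.0],
    "long": [400.0, 500.0, 600.0],
}
regressors = {c: LinearRegressor().fit(gaps, ys) for c, ys in toy_durations_ms.items()}

words = "Once upon a time, there lived a king.".split()
result = predict_pauses(words, toy_has_pause, toy_categorize, regressors)
print(result)
```

Passing the first two stages in as callables keeps the sketch independent of any particular classifier, so learned models could replace the toy rules without changing predict_pauses.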