• DocumentCode
    3705094
  • Title

    Analysis and modeling pauses for synthesis of storytelling speech based on discourse modes

  • Author

    Parakrant Sarkar;K. Sreenivasa Rao

  • Author_Institution
    School of Information Technology, Indian Institute of Technology Kharagpur, 721302, West Bengal, India
  • fYear
    2015
  • Firstpage
    225
  • Lastpage
    230
  • Abstract
    In text-to-speech (TTS) synthesis systems, pause prediction plays a vital role in synthesizing natural and expressive speech. In the storytelling style, pauses introduce suspense and climax by emphasizing the prominent or emotion-salient words in a story. The objective of this work is to analyze and model the pause pattern so as to capture story-semantic information, as a stepping stone towards developing a Story TTS based on modes of discourse. We analyze the pauses in Hindi children's stories for each mode of discourse: narrative, descriptive and dialogue. After grouping the sentences by mode, we analyze the pause pattern within each mode. A three-stage data-driven method is proposed to predict the location and duration of pauses for each mode. Both objective and subjective tests are conducted to evaluate the performance of the proposed method. The subjective evaluation indicates that subjects appreciate the quality of the speech synthesized with the proposed model.
  • Keywords
    "Hidden Markov models","Yttrium","Manuals","Context modeling"
  • Publisher
    IEEE
  • Conference_Titel
    Contemporary Computing (IC3), 2015 Eighth International Conference on
  • Print_ISBN
    978-1-4673-7947-2
  • Type

    conf

  • DOI
    10.1109/IC3.2015.7346683
  • Filename
    7346683