DocumentCode :
3705094
Title :
Analysis and modeling pauses for synthesis of storytelling speech based on discourse modes
Author :
Parakrant Sarkar;K. Sreenivasa Rao
Author_Institution :
School of Information Technology, Indian Institute of Technology Kharagpur, 721302, West Bengal, India
fYear :
2015
Firstpage :
225
Lastpage :
230
Abstract :
In Text-to-Speech (TTS) synthesis systems, pause prediction plays a vital role in synthesizing natural and expressive speech. In the storytelling style, pauses introduce suspense and climax by emphasizing prominent or emotion-salient words in a story. The objective of this work is to analyze and model the pause pattern in order to capture story-semantic information, as a stepping stone towards developing a Story TTS based on modes of discourse. We analyze the pauses in Hindi children's stories for each mode of discourse: narrative, descriptive and dialogue. After grouping the sentences by mode, we analyze the pause pattern of each mode to capture the story-semantic information. A three-stage data-driven method is proposed to predict the location and duration of pauses for each mode. Both objective and subjective tests are conducted to evaluate the performance of the proposed method. The subjective evaluation indicates that subjects appreciate the quality of speech synthesized by incorporating the proposed model.
Keywords :
"Hidden Markov models","Yttrium","Manuals","Context modeling"
Publisher :
IEEE
Conference_Title :
Contemporary Computing (IC3), 2015 Eighth International Conference on
Print_ISBN :
978-1-4673-7947-2
Type :
conf
DOI :
10.1109/IC3.2015.7346683
Filename :
7346683