Regional Information Center for Science and Technology (RICeST) - Expressive speech synthesis using American English ToBI: questions and contrastive emphasis

DocumentCode :

3246471

Title :

Expressive speech synthesis using American English ToBI: questions and contrastive emphasis

Author :

Pitrelli, John E. ; Eide, Ellen M.

Author_Institution :

IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA

fYear :

2003

fDate :

30 Nov.-3 Dec. 2003

Firstpage :

694

Lastpage :

699

Abstract :

We describe American English concatenative text-to-speech synthesis experiments in which "expressions", namely questioning and contrastive emphasis, are each associated with a ToBI prosodic template. ToBI labels, along with text features, are in turn incorporated into decision-tree models of F0 and segment duration to be used during synthesis, sparing the need for expression-specific large corpora and decision trees. Synthesizing using this approach enables listeners to perform the difficult task of distinguishing yes-no questions from identically-worded declarative sentences 78% of the time, compared to the baseline system's 50%. For contrastive emphasis, a sentence is synthesized with emphasis on a word which is chosen appropriately or inappropriately based on a preceding sentence. Listeners' mean opinion scores for appropriate emphases exceed inappropriate by 0.40 on a 1-to-5 scale for the experimental system, compared to a difference of 0.11 for the baseline, a significant system difference (p<0.01).

Keywords :

decision trees; learning (artificial intelligence); linguistics; speech synthesis; American English prosodic template; concatenative text-to-speech synthesis; contrastive emphasis; decision trees; decision-tree models; expression-specific corpora; expressive speech synthesis; learning; linguistic framework; questioning; segment duration; Acoustics; Context modeling; Costs; Decision trees; Dictionaries; Humans; Large-scale systems; Speech processing; Speech synthesis; Synthesizers;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on

Print_ISBN :

0-7803-7980-2

Type :

conf

DOI :

10.1109/ASRU.2003.1318524

Filename :

1318524

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3246471