Regional Information Center for Science and Technology - Applying F0, duration and power models with microprosody components in text to speech (TTS) synthesis

DocumentCode :

1952717

Title :

Applying F0, duration and power models with microprosody components in text to speech (TTS) synthesis

Author :

Low, Phuay Hui ; Vaseghi, Saeed

Author_Institution :

Sch. of Inf. Syst., Comput. & Math., Brunel Univ., London

fYear :

2005

fDate :

8-10 June 2005

Firstpage :

229

Lastpage :

232

Abstract :

This paper proposes generic F0, duration and power models with microprosodic components for use in concatenative text-to-speech (TTS) synthesis. The proposed F0 and duration models also include a global component. The global component models the long-term intonation patterns in speech, and the microprosody component models the sequential dependency of the acoustic correlates of speech. The microprosody model is based on a first-order Markovian model of biphone segments. Sentences synthesised using the proposed F0, duration and power models gave an average MOS score of 3.76, which is 0.49 higher than that obtained without the models.

Keywords :

Markov processes; speech synthesis; biphone segments; duration models; first-order Markovian model; global component; microprosody components; power models; text to speech synthesis; Design engineering; Information systems; Mathematical model; Mathematics; Power engineering and energy; Power engineering computing; Power system modeling; Speech analysis; Speech synthesis; Stress

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

ELMAR, 2005. 47th International Symposium

Conference_Location :

Zadar

Print_ISBN :

953-7044-01-4

Type :

conf

DOI :

10.1109/ELMAR.2005.193684

Filename :

1505685

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1952717