Title :
Applying F0, duration and power models with microprosody components in text to speech (TTS) synthesis
Author :
Low, Phuay Hui ; Vaseghi, Saeed
Author_Institution :
Sch. of Inf. Syst., Comput. & Math., Brunel Univ., London
Abstract :
This paper proposes generic F0, duration and power models with microprosodic components for use in concatenative text-to-speech (TTS) synthesis. The proposed F0 and duration models also include a global component. The global component models the long-term intonation patterns in speech, and the microprosody component models the sequential dependency of the acoustic correlates of speech. The microprosody model is based on a first-order Markovian model of biphone segments. Sentences synthesised using the proposed F0, duration and power models gave an average mean opinion score (MOS) of 3.76, which is 0.49 higher than that obtained without the models.
Keywords :
Markov processes; speech synthesis; biphone segments; duration models; first-order Markovian model; global component; microprosody components; power models; text to speech synthesis; Design engineering; Information systems; Mathematical model; Mathematics; Power engineering and energy; Power engineering computing; Power system modeling; Speech analysis; Speech synthesis; Stress
Conference_Title :
ELMAR, 2005. 47th International Symposium
Conference_Location :
Zadar
Print_ISBN :
953-7044-01-4
DOI :
10.1109/ELMAR.2005.193684