Title :
Recent improvements on Microsoft's trainable text-to-speech system - Whistler
Author :
Huang, X. ; Acero, A. ; Hon, H. ; Ju, Y. ; Liu, J. ; Meredith, S. ; Plumpe, M.
Author_Institution :
Microsoft Corp., Redmond, WA, USA
Abstract :
The Whistler text-to-speech engine was designed so that its model parameters can be constructed automatically from training data. This paper focuses on improvements in prosody and acoustic modeling, all of which are derived through the use of probabilistic learning methods. Whistler can produce synthetic speech that sounds very natural and resembles the acoustic and prosodic characteristics of the original speaker. The underlying technologies used in Whistler can significantly facilitate the process of creating generic TTS systems for a new language, a new voice, or a new speech style. The Whistler TTS engine supports the Microsoft Speech API and requires less than 3 MB of working memory.
Keywords :
acoustic signal processing; learning systems; probability; speech intelligibility; speech processing; speech synthesis; Microsoft Speech API; Microsoft trainable text to speech system; Whistler text to speech engine; acoustic characteristics; acoustic modeling; automatic model parameters construction; language; memory; natural speech; probabilistic learning methods; prosodic characteristics; prosody modeling; speech style; synthetic speech; training data; voice; Acoustic distortion; Engines; Loudspeakers; Man machine systems; Natural languages; Runtime; Speech processing; Speech synthesis; Synthesizers; Training data;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location :
Munich
Print_ISBN :
0-8186-7919-0
DOI :
10.1109/ICASSP.1997.596097