Whistler: a trainable text-to-speech system

Author

Huang, Xuedong ; Acero, Alex ; Adcock, Jim ; Hon, Hsiao-Wuen ; Goldsmith, John ; Liu, Jingsong ; Plumpe, Mike

Author_Institution

Microsoft Corp., Redmond, WA, USA

Volume

4

fYear

1996

fDate

3-6 Oct 1996

Firstpage

2387

Abstract

We introduce Whistler, a trainable text-to-speech (TTS) system that automatically learns its model parameters from a corpus. Both the prosody parameters and the concatenative speech units are derived using probabilistic learning methods that have been used successfully for speech recognition. Whistler can produce synthetic speech that sounds very natural and resembles the acoustic and prosodic characteristics of the original speaker. The underlying technologies used in Whistler can significantly facilitate the process of creating generic TTS systems for a new language, a new voice, or a new speech style.

Keywords

learning (artificial intelligence); natural language interfaces; natural languages; probability; speech processing; speech synthesis; Whistler; concatenative speech units; generic TTS systems; model parameters; probabilistic learning methods; prosodic characteristics; prosody parameters; speech style; synthetic speech; trainable text to speech system; Learning systems; Loudspeakers; Natural language processing; Natural languages; Runtime; Speech analysis; Speech processing; Speech recognition; Speech synthesis; Synthesizers

fLanguage

English

Publisher

IEEE

Conference_Titel

Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on

Conference_Location

Philadelphia, PA

Print_ISBN

0-7803-3555-4

Type

conf

DOI

10.1109/ICSLP.1996.607289

Filename

607289