Title :
A high quality text-to-speech system composed of multiple neural networks
Author :
Karaali, Orhan ; Corrigan, Gerald ; Massey, Noel ; Miller, Corey ; Schnurr, Otto ; Mackie, Andrew
Author_Institution :
Speech Process. Lab., Motorola Inc., Schaumburg, IL, USA
Abstract :
While neural networks have been employed to handle several different text-to-speech tasks, ours is the first system to use neural networks throughout, for both linguistic and acoustic processing. We divide the text-to-speech task into three subtasks: a linguistic module mapping from text to a linguistic representation, an acoustic module mapping from the linguistic representation to speech, and a video module mapping from the linguistic representation to animated images. The linguistic module employs a letter-to-sound neural network and a postlexical neural network. The acoustic module employs a duration neural network and a phonetic neural network. The visual neural network is employed in parallel to the acoustic module to drive a talking head. The use of neural networks that can be retrained on the characteristics of different voices and languages affords our system a degree of adaptability and naturalness heretofore unavailable.
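The data flow described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the class `TTSPipeline`, its field names, the method `synthesize`, and the list-based data types are all hypothetical, and each "network" is a placeholder callable standing in for a trained neural network.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical types: the abstract does not specify the actual representations.
Phones = List[str]


@dataclass
class TTSPipeline:
    """Assumed wiring of the five networks named in the abstract."""
    letter_to_sound: Callable[[str], Phones]                # text -> phones
    postlexical: Callable[[Phones], Phones]                 # phones -> adjusted phones
    duration: Callable[[Phones], List[float]]               # phones -> per-phone durations
    phonetic: Callable[[Phones, List[float]], List[float]]  # phones + durations -> speech
    visual: Callable[[Phones], List[str]]                   # phones -> animation frames

    def synthesize(self, text: str) -> Tuple[List[float], List[str]]:
        # Linguistic module: text -> linguistic representation.
        linguistic = self.postlexical(self.letter_to_sound(text))
        # Acoustic module: linguistic representation -> speech.
        durations = self.duration(linguistic)
        speech = self.phonetic(linguistic, durations)
        # Video module: consumes the same linguistic representation,
        # independently of the acoustic module.
        frames = self.visual(linguistic)
        return speech, frames


# Toy stand-ins so the sketch runs end to end; real networks would be trained models.
pipeline = TTSPipeline(
    letter_to_sound=lambda text: list(text.lower().replace(" ", "")),
    postlexical=lambda phones: phones,
    duration=lambda phones: [0.1] * len(phones),
    phonetic=lambda phones, durs: [0.0 for _ in durs],
    visual=lambda phones: [f"viseme:{p}" for p in phones],
)
speech, frames = pipeline.synthesize("Hi there")
```

Because every stage is an injected callable in this sketch, any one network could be swapped for a version retrained on a different voice or language without changing the surrounding wiring, which is the kind of adaptability the abstract emphasizes.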
Keywords :
acoustic signal processing; natural languages; neural nets; signal representation; speech intelligibility; speech synthesis; video signal processing; acoustic module; acoustic processing; animated images; duration neural network; high quality text-to-speech system; letter-to-sound neural network; linguistic module; linguistic processing; linguistic representation; multiple neural networks; phonetic neural network; postlexical neural network; subtasks; talking head; video module; visual neural network; Databases; Dictionaries; Head; Natural languages; Neural networks; Recurrent neural networks; Speech processing; Speech synthesis; Stress; Synthesizers
Conference_Title :
Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 1998)
Conference_Location :
Seattle, WA
Print_ISBN :
0-7803-4428-6
DOI :
10.1109/ICASSP.1998.675495