DocumentCode
352326
Title
Transition-based speech synthesis using neural networks
Author
Corrigan, G. ; Massey, N. ; Schnurr, O.
Author_Institution
Motorola Labs., Schaumburg, IL, USA
Volume
2
fYear
2000
fDate
2000
Abstract
Prior attempts to use neural networks to synthesize speech from a phonetic representation have used the neural network to generate a frame of input to a vocoder. As this requires the neural network to compute one output for each frame of speech from the vocoder, this can be computationally expensive. An alternative implementation is to model the speech as a series of gestures, and let the neural network generate parameters describing the transitions of the vocoder parameters during these gestures. Experiments have shown that acceptable speech quality is produced when each gesture is half of a phonetic segment and the transition model is a set of cubic polynomials describing the variation of each vocoder parameter during the gesture. This results in a significant reduction in computational cost.
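The transition model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the coefficient ordering, and the normalized time axis running from 0 to 1 over each gesture are all assumptions made for illustration, and the neural network is replaced by coefficients supplied directly. The sketch only shows why the network needs to run once per gesture rather than once per frame: each vocoder parameter track is recovered by evaluating a cubic polynomial at every frame.

```python
import numpy as np

def render_gesture(coeffs, n_frames):
    """Expand one gesture into vocoder parameter frames.

    coeffs   : array of shape (n_params, 4), one cubic per vocoder parameter,
               ordered (c0, c1, c2, c3) -- an assumed ordering.
    n_frames : number of vocoder frames the gesture spans.
    Returns an array of shape (n_frames, n_params).
    """
    coeffs = np.asarray(coeffs, dtype=float)
    # Assumed convention: time is normalized to [0, 1] within the gesture.
    t = np.linspace(0.0, 1.0, n_frames)
    # Each row holds (1, t, t^2, t^3) for one frame.
    powers = np.stack([t**0, t, t**2, t**3], axis=1)
    # p(t) = c0 + c1*t + c2*t^2 + c3*t^3 for every parameter at once.
    return powers @ coeffs.T

def render_utterance(gestures):
    """Concatenate frames for a sequence of (coeffs, n_frames) gestures.

    With half-phone gestures, a model would supply two such entries per
    phonetic segment; here they are simply given as inputs.
    """
    return np.concatenate([render_gesture(c, n) for c, n in gestures], axis=0)
```

Under these assumptions, a gesture with coefficients (1, 2, 0, 0) for a single parameter rendered over 3 frames ramps linearly from 1 to 3, and the cost of the polynomial evaluation is a small matrix product per gesture.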
Keywords
neural nets; polynomial approximation; speech synthesis; vocoders; computational cost reduction; cubic polynomials; neural networks; phonetic representation; phonetic segment; series of gestures; speech modeling; speech quality; transition model; transition-based speech synthesis; vocoder parameters; Computational efficiency; Computer networks; Equations; Mean square error methods; Network synthesis; Neural networks; Polynomials; Speech synthesis; Vocoders
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location
Istanbul
ISSN
1520-6149
Print_ISBN
0-7803-6293-4
Type
conf
DOI
10.1109/ICASSP.2000.859117
Filename
859117