DocumentCode :
294644
Title :
A prosodic model of Mandarin speech and its application to pitch level generation for text-to-speech
Author :
Shaw-Hwa Hwang ; Sin-Horng Chen
Author_Institution :
Dept. of Commun. Eng., Nat. Chiao Tung Univ., Hsinchu
Volume :
1
fYear :
1995
fDate :
9-12 May 1995
Firstpage :
616
Abstract :
A prosodic model of Mandarin speech is proposed that simulates the human pronunciation mechanism in order to explore the hidden pronunciation states embedded in the input text. Parameters representing these pronunciation states are then used to assist the generation of prosody information. A multirate recurrent neural network (MRNN) is employed to realize the prosodic model. Two learning methods are proposed to train the MRNN. The first is an indirect method that uses an additional SRNN to track the dynamics of the prosody information of the utterance and then takes the outputs of its hidden layer as desired targets for training the MRNN. The second is a direct training method that integrates the MRNN with the subsequent MLP prosody synthesizers to learn directly the relation between the input linguistic features and the output prosody information. Simulation results confirmed the effectiveness of the approach: most synthesized prosodic parameter sequences match their original counterparts quite well.
Keywords :
backpropagation; multilayer perceptrons; recurrent neural nets; speech synthesis; MLP prosody synthesizers; Mandarin speech; hidden pronunciation states; input linguistic features; input text; learning methods; multirate recurrent neural network; parameter sequences; pitch level generation; pronunciation mechanism; prosodic model; prosody information generation; simulation; text-to-speech; training; utterance; Brain modeling; Councils; Frequency; Indium phosphide; Information analysis; Integrated circuit modeling; Learning systems; Recurrent neural networks; Spatial databases; Speech synthesis; Target tracking; Telecommunication control
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on
Conference_Location :
Detroit, MI
ISSN :
1520-6149
Print_ISBN :
0-7803-2431-5
Type :
conf
DOI :
10.1109/ICASSP.1995.479673
Filename :
479673