DocumentCode
323838
Title
An MRNN-based method for continuous Mandarin speech recognition
Author
Liao, Yuan-Fu ; Chen, Sin-Horng
Author_Institution
Dept. of Commun. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
Volume
2
fYear
1998
fDate
12-15 May 1998
Firstpage
1121
Abstract
A new modular recurrent neural network (MRNN)-based method for continuous Mandarin speech recognition is proposed. The system uses five RNNs to handle separate subtasks and then combines their outputs to solve the overall recognition problem. They comprise two RNNs for discriminating the two sub-syllable groups of 100 right-final-dependent (RFD) initials and 39 context-independent (CI) finals, two RNNs for generating dynamic weighting functions for sub-syllable integration, and one RNN for syllable boundary detection. All RNN modules are combined using a delay-decision Viterbi search. The method differs from the ANN/HMM hybrid approach in that it uses ANNs to perform not only sub-syllable discrimination but also temporal structure modeling of the speech signal. The system is trained using a three-stage training method that incorporates the MCE/GPD algorithms. A fast recognition method using multi-level pruning is also proposed. Experimental results showed that the method outperforms the HMM method in both recognition accuracy and computational complexity.
Keywords
backpropagation; computational complexity; natural languages; neural net architecture; recurrent neural nets; search problems; speech processing; speech recognition; ANN/HMM hybrid approach; CI finals; MCE/GPD algorithms; MRNN-based method; RFD initials; RNN modules; computational complexity; context independent finals; continuous Mandarin speech recognition; delay-decision Viterbi search; dynamic weighting functions; error backpropagation algorithm; experimental results; fast recognition method; modular recurrent neural network; multi-level pruning; neural network architecture; recognition accuracy; right-final-dependent initials; speech signal; sub-syllable groups discrimination; sub-syllable integration; syllable boundary detection; temporal structure modeling; three-stage training method; Artificial neural networks; Computational complexity; Contracts; Councils; Delay; Error analysis; Hidden Markov models; Recurrent neural networks; Speech recognition; Viterbi algorithm;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location
Seattle, WA
ISSN
1520-6149
Print_ISBN
0-7803-4428-6
Type
conf
DOI
10.1109/ICASSP.1998.675466
Filename
675466