• DocumentCode
    323838
  • Title
    An MRNN-based method for continuous Mandarin speech recognition
  • Author
    Liao, Yuan-Fu ; Chen, Sin-Horng
  • Author_Institution
    Dept. of Commun. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
  • Volume
    2
  • fYear
    1998
  • fDate
    12-15 May 1998
  • Firstpage
    1121
  • Abstract
    A new modular recurrent neural network (MRNN)-based method for continuous Mandarin speech recognition is proposed. The system uses five RNNs to accomplish separate subtasks and then combines them to solve the overall problem. These comprise two RNNs for discriminating the two sub-syllable groups of 100 right-final-dependent (RFD) initials and 39 context-independent (CI) finals, two RNNs for generating dynamic weighting functions for sub-syllable integration, and one RNN for syllable boundary detection. All RNN modules are combined using a delay-decision Viterbi search. The method differs from the ANN/HMM hybrid approach in that it uses ANNs to perform not only sub-syllable discrimination but also temporal structure modeling of the speech signal. The system is trained using a three-stage training method embedded with the MCE/GPD algorithms. A fast recognition method using multi-level pruning is also proposed. Experimental results show that it outperforms the HMM method in both recognition accuracy and computational complexity.
  • Keywords
    backpropagation; computational complexity; natural languages; neural net architecture; recurrent neural nets; search problems; speech processing; speech recognition; ANN/HMM hybrid approach; CI finals; MCE/GPD algorithms; MRNN-based method; RFD initials; RNN modules; context independent finals; continuous Mandarin speech recognition; delay-decision Viterbi search; dynamic weighting functions; error backpropagation algorithm; experimental results; fast recognition method; modular recurrent neural network; multi-level pruning; neural network architecture; recognition accuracy; right-final-dependent initials; speech signal; sub-syllable groups discrimination; sub-syllable integration; syllable boundary detection; temporal structure modeling; three-stage training method; Artificial neural networks; Computational complexity; Contracts; Councils; Delay; Error analysis; Hidden Markov models; Recurrent neural networks; Speech recognition; Viterbi algorithm
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
  • Conference_Location
    Seattle, WA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-4428-6
  • Type
    conf
  • DOI
    10.1109/ICASSP.1998.675466
  • Filename
    675466