Title :
Advanced training methods and new network topologies for hybrid MMI-connectionist/HMM speech recognition systems
Author :
Neukirchen, Christoph ; Rigoll, Gerhard
Author_Institution :
Dept. of Comput. Sci., Gerhard-Mercator-Univ., Duisburg, Germany
Abstract :
This paper deals with the construction and optimization of a hybrid speech recognition system that combines a neural vector quantizer (VQ) with discrete HMMs. We integrate VQ-based classification into the continuous classifier framework and derive constraints that the PDFs must satisfy in the discrete pattern classifier context. Furthermore, we show that for ML training of the whole system the VQ parameters must be estimated according to the maximum mutual information (MMI) criterion. A novel gradient-search training method is derived for neural networks that serve as optimal VQs. It allows faster training of arbitrary network topologies than traditional MMI-NN training. Integrating multilayer MMI-NNs as the VQ in the hybrid discrete-HMM speech recognizer yields a large improvement over other supervised and unsupervised single-layer VQ systems. On the speaker-independent Resource Management database, the hybrid MMI-connectionist/HMM system achieves recognition rates comparable to those of traditional, sophisticated continuous-PDF HMM systems.
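As an illustration of the MMI criterion named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the mutual information is measured between frame-level class labels and the soft outputs of the VQ network. The function name `mutual_information` and its arguments are hypothetical. The sketch only evaluates the criterion; the gradient-search training itself is not shown.

```python
import math

def mutual_information(class_ids, vq_outputs, n_classes):
    """Estimate I(W; Y) in bits between class labels W and VQ labels Y.

    class_ids:  one integer class index per frame (assumed input)
    vq_outputs: one list of VQ label probabilities per frame,
                each list summing to 1 (assumed soft VQ output)
    """
    n_frames = len(class_ids)
    n_labels = len(vq_outputs[0])

    # Joint distribution p(w, y): accumulate soft VQ outputs per class.
    joint = [[0.0] * n_labels for _ in range(n_classes)]
    for w, probs in zip(class_ids, vq_outputs):
        for y, p in enumerate(probs):
            joint[w][y] += p / n_frames

    # Marginals p(w) and p(y).
    p_w = [sum(row) for row in joint]
    p_y = [sum(joint[w][y] for w in range(n_classes)) for y in range(n_labels)]

    # Sum p(w, y) * log2(p(w, y) / (p(w) p(y))) over non-zero cells.
    mi = 0.0
    for w in range(n_classes):
        for y in range(n_labels):
            if joint[w][y] > 0.0:
                mi += joint[w][y] * math.log2(joint[w][y] / (p_w[w] * p_y[y]))
    return mi
```

Under these assumptions, a VQ whose labels determine the class exactly reaches the class entropy, while a VQ whose output ignores the input yields zero.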
Keywords :
hidden Markov models; learning (artificial intelligence); maximum likelihood estimation; network topology; neural nets; pattern classification; search problems; speech processing; speech recognition; vector quantisation; HMM speech recognition systems; ML training; PDF; VQ; continuous classifier; discrete HMM; discrete pattern classifier; gradient search; hybrid MMI-connectionist/HMM system; maximum mutual information; multilayer MMI-NN; network topologies; neural vector quantizer; parameter estimation; recognition rates; speaker independent Resource Management database; training methods; Hidden Markov models; Management training; Maximum likelihood estimation; Multi-layer neural network; Mutual information; Network topology; Neural networks; Parameter estimation; Resource management; Speech recognition;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location :
Munich
Print_ISBN :
0-8186-7919-0
DOI :
10.1109/ICASSP.1997.595488