Title :
Non-linear input transformations for discriminative HMMs
Author :
Johansen, Finn Tore ; Johnsen, Magne Hallstein
Author_Institution :
Dept of Telecommun., Inst. of Technol., Trondheim, Norway
Abstract :
This paper deals with speaker-independent continuous speech recognition. Our approach is based on continuous density hidden Markov models with a non-linear input feature transformation performed by a multilayer perceptron. We discuss various optimisation criteria and provide results on a TIMIT phoneme recognition task, using both a single-frame MMI (mutual information, or relative entropy) criterion embedded in Viterbi training and a global MMI criterion. As expected, global MMI is found superior to the frame-based criterion for continuous recognition. We further observe that optimal sentence decoding is essential to achieve the maximum recognition rate for models trained by global MMI. Finally, we find that the simple MLP input transformation, with five frames of context information, can increase the recognition rate significantly compared to just using delta parameters.
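As a rough illustration of the input transformation described in the abstract, the following NumPy sketch stacks five consecutive feature frames and passes them through a small MLP whose outputs would serve as HMM observation vectors. Only the five-frame context comes from the abstract; the feature dimension, hidden-layer size, tanh activation, edge padding, random weights, and the names `stack_context` and `mlp_transform` are assumptions made for this example and are not the authors' configuration.

```python
import numpy as np


def stack_context(frames, context=5):
    """Stack each frame with its neighbours: (T, D) -> (T, context * D)."""
    half = context // 2
    # Assumption: utterance edges are padded by repeating the first/last frame.
    padded = np.pad(frames, ((half, half), (0, 0)), mode="edge")
    num_frames = frames.shape[0]
    return np.hstack([padded[i:i + num_frames] for i in range(context)])


def mlp_transform(x, w1, b1, w2, b2):
    """One hidden layer with tanh units and a linear output layer (assumed)."""
    hidden = np.tanh(x @ w1 + b1)
    return hidden @ w2 + b2


# Hypothetical sizes: 20 frames, 13 coefficients, 32 hidden units, 13 outputs.
T, D, H, OUT, CONTEXT = 20, 13, 32, 13, 5
rng = np.random.default_rng(0)
frames = rng.standard_normal((T, D))

stacked = stack_context(frames, CONTEXT)

# Untrained random weights, for shape illustration only.
w1 = 0.1 * rng.standard_normal((CONTEXT * D, H))
b1 = np.zeros(H)
w2 = 0.1 * rng.standard_normal((H, OUT))
b2 = np.zeros(OUT)

features = mlp_transform(stacked, w1, b1, w2, b2)
print(stacked.shape, features.shape)
```

In a discriminatively trained system the MLP weights would be optimised jointly with the HMM parameters under the chosen criterion; the sketch covers only the forward pass.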
Keywords :
decoding; entropy; hidden Markov models; learning (artificial intelligence); maximum likelihood estimation; multilayer perceptrons; optimisation; speech recognition; MLP input transformation; TIMIT phoneme recognition; Viterbi training; continuous density hidden Markov models; discriminative HMM; global MMI; maximum recognition rate; multilayer perceptron; mutual information; nonlinear input feature transformation; optimal sentence decoding; optimisation criteria; relative entropy; speaker-independent continuous speech recognition; Artificial neural networks; Detectors; Maximum likelihood decoding; Maximum likelihood detection; Speech processing; Viterbi algorithm
Conference_Title :
1994 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-94)
Conference_Location :
Adelaide, SA
Print_ISBN :
0-7803-1775-0
DOI :
10.1109/ICASSP.1994.389314