Title :
A comparison of hybrid HMM architecture using global discriminating training
Author :
Johansen, Finn Tore
Author_Institution :
Telenor Res. & Dev., Kjeller, Norway
Abstract :
This paper presents a comparison of different model architectures for TIMIT phoneme recognition. The baseline is a conventional diagonal covariance Gaussian mixture HMM. This system is compared to two different hybrid MLP/HMMs, both adhering to the same restrictions regarding input context and output states as the Gaussian mixtures. All free parameters in the three systems are jointly optimised using the same global discriminative criterion. A forward decoder, with total likelihood scoring, is used for recognition. While the global discriminative training method is found to improve the baseline HMM significantly, the differences between the Gaussian and MLP-based architectures are small. The Gaussian mixture system, however, performs slightly better at the lowest complexity levels.
Keywords :
feedforward neural nets; hidden Markov models; learning (artificial intelligence); maximum likelihood estimation; recurrent neural nets; speech recognition; Gaussian mixtures; TIMIT phoneme recognition; diagonal covariance Gaussian mixture HMM; forward decoder; global discriminating training; global discriminative criterion; global discriminative training method; hybrid HMM architecture; total likelihood scoring; Artificial neural networks; Hidden Markov models; Maximum likelihood decoding; Multilayer perceptrons; Recurrent neural networks; Research and development; Speech recognition; Stochastic processes; Viterbi algorithm; Vocabulary;
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
DOI :
10.1109/ICSLP.1996.607163