Title :
Audio-visual speech recognition using minimum classification error training
Author :
Miyajima, Chiyomi ; Tokuda, Keiichi ; Kitamura, Tadashi
Author_Institution :
Dept. of Comput. Sci., Nagoya Inst. of Technol., Japan
Abstract :
Presents a framework for designing a hidden Markov model (HMM)-based audio-visual automatic speech recognition system using minimum classification error (MCE) training. The audio and visual HMMs are optimized by MCE training with the generalized probabilistic descent (GPD) method, and their likelihoods are combined using model-dependent stream weights, which are also estimated with the GPD method. Experimental results on speaker-independent isolated word recognition show that GPD optimization of the audio and visual HMMs, together with GPD-based model-dependent stream weights, significantly improves system performance, giving a 47%-81% error reduction over a conventional system consisting of HMMs trained under the maximum likelihood criterion and globally tied stream weights estimated with the GPD method.
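The likelihood combination and MCE criterion summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the two-word vocabulary, the numeric log-likelihoods, the constraint that the audio and visual weights sum to 1, and the smoothing constants `eta` and `gamma` are all assumptions made for illustration. The loss follows the commonly used MCE formulation (a sigmoid of a misclassification measure); the abstract does not give the exact form used in the paper, and the GPD parameter update itself is not shown.

```python
import math

def combined_score(log_lik_audio, log_lik_visual, w_audio):
    """Weighted sum of the two stream log-likelihoods for one word model.

    Assumption: the visual weight is 1 - w_audio.
    """
    return w_audio * log_lik_audio + (1.0 - w_audio) * log_lik_visual

def mce_loss(scores, correct, eta=1.0, gamma=1.0):
    """Smoothed MCE loss for one utterance (standard form, assumed here).

    scores:  dict mapping each word to its combined score
    correct: the reference word for the utterance
    """
    rivals = [g for word, g in scores.items() if word != correct]
    # Soft maximum over competing words: (1/eta) * log(mean(exp(eta * g))),
    # computed relative to the largest rival for numerical stability.
    m = max(rivals)
    anti = m + math.log(sum(math.exp(eta * (g - m)) for g in rivals) / len(rivals)) / eta
    d = anti - scores[correct]  # misclassification measure; > 0 means an error
    return 1.0 / (1.0 + math.exp(-gamma * d))  # sigmoid smoothing of the 0/1 error

# Hypothetical per-word log-likelihoods from the audio and visual HMMs.
audio = {"yes": -10.0, "no": -12.0}
visual = {"yes": -8.0, "no": -5.0}

# Model-dependent stream weights: one audio weight per word model.
# A globally tied system would share a single value across all words.
w_audio = {"yes": 0.7, "no": 0.7}

scores = {word: combined_score(audio[word], visual[word], w_audio[word]) for word in audio}
recognized = max(scores, key=scores.get)
loss_if_yes = mce_loss(scores, "yes")
loss_if_no = mce_loss(scores, "no")
```

With these made-up numbers the combined scores are about -9.4 for "yes" and -9.9 for "no", so "yes" is recognized; the loss is below 0.5 when "yes" is the reference and above 0.5 when "no" is. In GPD-style training, the HMM parameters and the stream weights would be adjusted by gradient steps that reduce this loss over the training data.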
Keywords :
audio-visual systems; errors; gradient methods; hidden Markov models; learning (artificial intelligence); minimisation; signal classification; speech recognition; audiovisual speech recognition system; error reduction; generalized probabilistic descent method; globally-tied stream weights; hidden Markov model; likelihood combination; maximum likelihood criterion; minimum classification error training; model-dependent stream weight estimation; optimization; speaker-independent isolated word recognition; system performance; Automatic speech recognition; Computer errors; Computer science; Electronic mail; Hidden Markov models; Maximum likelihood estimation; Optimization methods; Speech recognition; Streaming media; System performance;
Conference_Title :
Neural Networks for Signal Processing X, 2000. Proceedings of the 2000 IEEE Signal Processing Society Workshop
Conference_Location :
Sydney, NSW
Print_ISBN :
0-7803-6278-0
DOI :
10.1109/NNSP.2000.889354