DocumentCode :
3464861
Title :
High performance telephony speech recognition via cascade HMM/ANN hybrid
Author :
Gholampour, Iman ; Nayebi, Kambiz
Author_Institution :
Dept. of Electr. Eng., Sharif Univ. of Technol., Tehran, Iran
Volume :
2
fYear :
1999
fDate :
1999
Firstpage :
645
Abstract :
A new formulation for discriminative training of HMMs is introduced as a solution to the telephony speech recognition problem. This formulation uses a properly trained multilayer perceptron (MLP) in a simple interconnection with HMMs, called the “cascade HMM/ANN hybrid”. Our training algorithm is simpler to realize than other discriminative training methods for HMMs, such as MDI and MMI. We also present a rigorous mathematical proof of its convergence. We found that using the cascade HMM/ANN for telephony isolated word recognition raises recognition accuracy from 88.1% with classic HMMs to 98.1% with a two-layer MLP. This structure is also more robust to the endpoint positions of words in the presence of background noise, particularly for mobile telephone calls. The recognition phase requires no significant increase in computation, and recognition can still be performed in real time. Both theoretical and experimental results are included in the paper.
Keywords :
convergence of numerical methods; hidden Markov models; land mobile radio; learning (artificial intelligence); multilayer perceptrons; radiotelephony; speech recognition; HMM interconnection; MDI; MMI; background noise; cascade HMM/ANN hybrid; computational requirements; convergence; discriminative training; ending point positions; experimental results; hidden Markov models; high performance telephony speech recognition; isolated word recognition; mathematical proof; mobile telephone calls; real-time recognition; recognition accuracy; trained MLP; training algorithm; two layer multilayer perceptron; Artificial neural networks; Automatic speech recognition; Background noise; Convergence; Hidden Markov models; Multilayer perceptrons; Noise robustness; Speech recognition; State estimation; Telephony;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing and Its Applications, 1999. ISSPA '99. Proceedings of the Fifth International Symposium on
Conference_Location :
Brisbane, Qld.
Print_ISBN :
1-86435-451-8
Type :
conf
DOI :
10.1109/ISSPA.1999.815755
Filename :
815755