DocumentCode :
311014
Title :
Inter-digit HMM: connected digit recognition using the Macrophone corpus
Author :
Kao, Yu-Hung ; Netsch, Lorin
Author_Institution :
Texas Instrum. Inc., Dallas, TX, USA
Volume :
3
fYear :
1997
fDate :
21-24 Apr 1997
Firstpage :
1739
Abstract :
Continuous digit recognition over the telephone channel is a key technology for many telecommunications applications such as voice dialing, automatic banking, and credit card number entry. Speech recognizers usually achieve high performance by modeling the acoustics in hidden Markov models (HMMs) using large numbers of multivariate Gaussian mixtures with assumed diagonal covariance in order to model the variability of different speakers and channel conditions. We present a system that uses a single-mixture, 16-feature Gaussian distribution with an assumed identity covariance to achieve a 1.0% word error rate and a 5.7% sentence error rate on the Macrophone corpus. We found that inter-digit modeling, discriminant training, and per-utterance adaptation can each contribute about a 30% reduction in the error rate. Using this approach, we can realize a system with relatively low memory requirements.
Keywords :
Gaussian distribution; acoustic signal processing; hidden Markov models; speech processing; speech recognition; telecommunication channels; telephony; Gaussian distributions; Macrophone corpus; acoustics modeling; automatic banking; channel conditions; connected digit recognition; credit card number entry; diagonal covariance; discriminant training; error rate reduction; hidden Markov models; identity covariance; interdigit HMM; interdigit modeling; multivariate Gaussian mixtures; per-utterance adaptation; sentence error rate; speech recognizers; telecommunications applications; telephone channel; voice dialing; word error rate; Acoustics; Automatic speech recognition; Banking; Credit cards; Error analysis; Gaussian distribution; Hidden Markov models; Loudspeakers; Speech recognition; Telephony;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location :
Munich
ISSN :
1520-6149
Print_ISBN :
0-8186-7919-0
Type :
conf
DOI :
10.1109/ICASSP.1997.598860
Filename :
598860