DocumentCode :
2180494
Title :
Performance of connected digit recognizers with context-dependent word duration modeling
Author :
Kwon, Oh Wook ; Un, Chong Kwan
Author_Institution :
Spoken Language Processing Sect., ETRI, Taejon, South Korea
fYear :
1996
fDate :
18-21 Nov 1996
Firstpage :
243
Lastpage :
246
Abstract :
In a Korean connected digit recognizer, insertion and deletion errors account for about half of the total recognition errors because the Korean language has two monophonemic digits. Previous studies showed that these errors are not corrected even by discriminative training algorithms. To reduce them, we propose to model context-dependent word duration information and incorporate it directly into the decoding algorithm. Experimental results show that, while incorporating duration information in the postprocessing stage does not achieve significant improvements over a baseline system, the proposed method reduces word error rates by as much as 10% for unknown-length decoding when the recognizer is trained by the maximum likelihood estimation and generalized probabilistic descent methods. Furthermore, simple duration modeling by a bounded uniform distribution achieves performance improvements comparable to detailed duration modeling by a gamma or Gaussian distribution, and hence offers a good compromise between performance and complexity.
Keywords :
Gaussian distribution; decoding; errors; gamma distribution; maximum likelihood estimation; probability; speech coding; speech recognition; Korean language; bounded uniform distribution; connected digit recognizers; context-dependent word duration modeling; decoding algorithm; deletion errors; duration information; generalized probabilistic descent method; insertion errors; monophonemic digits; postprocessing stage; recognition errors; word error rates; Context modeling; Error analysis; Error correction; Hidden Markov models; Maximum likelihood decoding; Natural languages; Pattern recognition; Probability distribution
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Circuits and Systems, 1996., IEEE Asia Pacific Conference on
Conference_Location :
Seoul
Print_ISBN :
0-7803-3702-6
Type :
conf
DOI :
10.1109/APCAS.1996.569264
Filename :
569264