Title :
A study on hidden Markov model's generalization capability for speech recognition
Author :
Xiao, Xiong ; Li, Jinyu ; Chng, Eng Siong ; Li, Haizhou ; Lee, Chin-Hui
Author_Institution :
Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
fDate :
Nov. 13 2009-Dec. 17 2009
Abstract :
In statistical learning theory, the generalization capability of a model is its ability to generalize well on unseen test data that follow the same distribution as the training data. This paper investigates how generalization capability can also improve robustness when the testing and training data come from different distributions, in the context of speech recognition. Two discriminative training (DT) methods are used to train the hidden Markov model (HMM) for better generalization capability, namely the minimum classification error (MCE) and the soft-margin estimation (SME) methods. Results on the Aurora-2 task show that both SME and MCE are effective in improving one measure of the acoustic model's generalization capability, i.e. the margin of the model, with SME being moderately more effective. In addition, the better generalization capability translates into better robustness of speech recognition performance, even when there is significant mismatch between the training and testing data. We also apply mean and variance normalization (MVN) to preprocess the data and reduce the training-testing mismatch. After MVN, MCE and SME perform even better, as the generalization capability is now more closely related to robustness. The best performance on Aurora-2 is obtained with SME, which achieves about 28% relative error rate reduction over the MVN baseline system. Finally, we also use SME to demonstrate the potential of better generalization capability for improving robustness on a more realistic noisy task, Aurora-3, where significant improvements are obtained.
Keywords :
error statistics; estimation theory; generalisation (artificial intelligence); hidden Markov models; speech recognition; statistical analysis; Aurora-2 task; Aurora-3 task; MVN baseline system; SME; acoustic model; discriminative training methods; generalization capability; hidden Markov model; mean and variance normalization; minimum classification error; relative error rate reduction; soft-margin estimation methods; speech recognition; statistical learning theory; training-testing mismatch; Acoustic testing; Data engineering; Distributed computing; Error analysis; Hidden Markov models; Noise robustness; Performance evaluation; Speech recognition; Statistical learning; Training data; Aurora task; minimum classification error; model generalization; robustness; soft margin estimation;
Conference_Title :
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
Conference_Location :
Merano
Print_ISBN :
978-1-4244-5478-5
Electronic_ISBN :
978-1-4244-5479-2
DOI :
10.1109/ASRU.2009.5373359