DocumentCode :
863235
Title :
Discriminative auditory-based features for robust speech recognition
Author :
Mak, Brian Kan-Wing ; Tam, Yik-Cheung ; Li, Peter Qi
Author_Institution :
Dept. of Comput. Sci., Hong Kong Univ. of Sci. & Technol., China
Volume :
12
Issue :
1
Year :
2004
Firstpage :
27
Lastpage :
36
Abstract :
Recently, a new auditory-based feature extraction algorithm for robust speech recognition in noisy environments was proposed. The new features are derived by closely mimicking the human peripheral auditory process, and the filters in the outer ear, middle ear, and inner ear are obtained from the psychoacoustics literature with some manual adjustments. In this paper, we extend the auditory-based feature extraction algorithm and propose to further train the auditory-based filters through discriminative training. Using this data-driven approach, we optimize the filters by minimizing the subsequent recognition errors on a task. One significant contribution over similar efforts in the past (generally under the name of "discriminative feature extraction") is that we make no assumption about the parametric form of the auditory-based filters. Instead, we only require the filters to be triangular-like: the filter weights have a maximum value in the middle and then monotonically decrease toward both ends. Discriminative training of these constrained auditory-based filters leads to improved performance. Furthermore, we study the combined discriminative training procedure for both feature and acoustic model parameters. Our experiments show that the best performance can be obtained in a sequential procedure under the unified framework of MCE/GPD.
Keywords :
acoustic filters; feature extraction; hearing; optimisation; parameter estimation; speech recognition; auditory-based filters; discriminative auditory-based features; discriminative feature extraction; discriminative training; feature extraction algorithm; generalized probabilistic descent; human peripheral auditory process; minimum classification error; noisy environment; psychoacoustics; recognition errors; robust speech recognition; Automatic speech recognition; Ear; Feature extraction; Filters; Hidden Markov models; Mathematical model; Psychoacoustic models; Robustness; Speech recognition; Working environment noise;
Language :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/TSA.2003.819951
Filename :
1261269