DocumentCode :
178057
Title :
Non-linear dimension reduction of Gabor features for noise-robust ASR
Author :
Gupta, Hitesh Anand ; Raju, Athira ; Alwan, Abeer
Author_Institution :
Dept. of Electr. Eng., Univ. of California, Los Angeles, Los Angeles, CA, USA
fYear :
2014
fDate :
4-9 May 2014
Firstpage :
1715
Lastpage :
1719
Abstract :
It has been shown that Gabor filters closely resemble the spectro-temporal response fields of neurons in the primary auditory cortex. A bank of 2-D Gabor filters can be applied to either the mel-spectrogram or the power-normalized spectrogram to obtain a set of physiologically inspired Gabor Filter Bank Features. The high dimensionality and correlated nature of these features pose a problem for automatic speech recognition (ASR). In the past, dimension reduction was performed through (1) feature selection, (2) channel selection, (3) linear dimension reduction or (4) tandem acoustic modelling. In this paper, we propose a novel solution to this problem based on channel selection and non-linear dimension reduction using Laplacian Eigenmaps. The reduced features are concatenated with Power Normalized Cepstral Coefficients (PNCC) to evaluate whether the two are complementary and improve performance. We show a relative reduction of 12.66% in the word error rate (WER) compared to the PNCC baseline when applied to the Aurora 4 database.
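The filtering step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the textbook form of a real 2-D Gabor filter (a Gaussian envelope multiplied by a cosine carrier), and the function names, kernel sizes and modulation frequencies are all hypothetical. The spectrogram is assumed to be a list of rows indexed as [frequency][time].

```python
import math


def gabor_kernel_2d(n_time, n_freq, omega_t, omega_f, sigma_t, sigma_f):
    """Real 2-D Gabor kernel, indexed [freq][time].

    Assumption: Gaussian envelope times cosine carrier. omega_t and omega_f
    are temporal and spectral modulation frequencies in radians per bin.
    """
    ct, cf = n_time // 2, n_freq // 2  # kernel centre
    kernel = []
    for f in range(n_freq):
        row = []
        for t in range(n_time):
            dt, df = t - ct, f - cf
            envelope = math.exp(-0.5 * ((dt / sigma_t) ** 2 + (df / sigma_f) ** 2))
            carrier = math.cos(omega_t * dt + omega_f * df)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_response(spec, kernel, f0, t0):
    """Correlate the kernel with the spectrogram, centred on bin (f0, t0).

    Bins that fall outside the spectrogram are treated as zero.
    """
    n_freq, n_time = len(kernel), len(kernel[0])
    cf, ct = n_freq // 2, n_time // 2
    total = 0.0
    for i in range(n_freq):
        for j in range(n_time):
            f, t = f0 + i - cf, t0 + j - ct
            if 0 <= f < len(spec) and 0 <= t < len(spec[0]):
                total += kernel[i][j] * spec[f][t]
    return total


# Hypothetical parameters, chosen only for illustration.
kernel = gabor_kernel_2d(n_time=9, n_freq=7, omega_t=0.5, omega_f=0.25,
                         sigma_t=2.0, sigma_f=1.5)

# Toy "spectrogram": 12 frequency bins x 20 frames with a single impulse.
spec = [[0.0] * 20 for _ in range(12)]
spec[5][10] = 1.0
response = filter_response(spec, kernel, f0=5, t0=10)
```

Evaluating one such response per filter and per time-frequency bin is what makes the resulting feature vector high-dimensional and correlated, which is the motivation the abstract gives for channel selection and dimension reduction.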
Keywords :
Gabor filters; channel bank filters; eigenvalues and eigenfunctions; feature extraction; feature selection; speech recognition; 2D Gabor filters; Aurora 4 database; Gabor filter bank features; Laplacian eigenmaps; PNCC baseline; WER; automatic speech recognition; channel selection; mel-spectrogram; noise-robust ASR; nonlinear dimension reduction; power normalized cepstral coefficients; power normalized spectrogram; primary auditory cortex; spectrotemporal response fields; tandem acoustic modelling; word error rate; Hidden Markov models; Laplace equations; Robustness; Speech; Vectors; Gabor filter-bank; Multi-layer perceptron
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
Type :
conf
DOI :
10.1109/ICASSP.2014.6853891
Filename :
6853891