Regional Information Center for Science and Technology - Non-linear dimension reduction of Gabor features for noise-robust ASR

DocumentCode :

178057

Title :

Non-linear dimension reduction of Gabor features for noise-robust ASR

Author :

Gupta, Hitesh Anand ; Raju, Athira ; Alwan, Abeer

Author_Institution :

Dept. of Electr. Eng., Univ. of California, Los Angeles, Los Angeles, CA, USA

Year :

2014

Date :

4-9 May 2014

Firstpage :

1715

Lastpage :

1719

Abstract :

It has been shown that Gabor filters closely resemble the spectro-temporal response fields of neurons in the primary auditory cortex. A filter bank of 2-D Gabor filters can be applied to either the mel-spectrogram or power normalized spectrogram to obtain a set of physiologically inspired Gabor Filter Bank Features. The high dimensionality and the correlated nature of these features pose an issue for ASR. In the past, dimension reduction was performed through (1) feature selection, (2) channel selection, (3) linear dimension reduction or (4) tandem acoustic modelling. In this paper, we propose a novel solution to this issue based on channel selection and non-linear dimension reduction using Laplacian Eigenmaps. These features are concatenated with Power Normalized Cepstral Coefficients (PNCC) to evaluate if the two are complementary and provide an improvement in performance. We show a relative reduction of 12.66% in the WER compared to the PNCC baseline, when applied to the Aurora 4 database.
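
The non-linear reduction step named in the abstract, Laplacian Eigenmaps, can be illustrated with a minimal NumPy sketch. This record does not describe how the authors built the neighbourhood graph or which parameters they used, so everything here is an assumption for illustration: the function name `laplacian_eigenmaps`, the k-nearest-neighbour graph, the heat-kernel width `sigma`, and the random matrix standing in for Gabor filter bank features.

```python
import numpy as np


def laplacian_eigenmaps(X, n_components=2, n_neighbors=5, sigma=1.0):
    """Embed the rows of X into n_components dimensions.

    Illustrative sketch only: graph type, n_neighbors and sigma are
    assumptions, not the settings used in the paper.
    """
    n = X.shape[0]

    # Pairwise squared Euclidean distances between feature vectors.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbour

    # Heat-kernel weights on a symmetrized k-nearest-neighbour graph.
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = idx.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / (2.0 * sigma ** 2))
    W = np.maximum(W, W.T)

    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order. The first eigenvector
    # is the trivial one for a connected graph, so it is skipped.
    _, vecs = np.linalg.eigh(L_sym)

    # Map back to solutions of the generalized problem L y = lambda D y.
    return d_inv_sqrt[:, None] * vecs[:, 1:n_components + 1]


# Random data standing in for high-dimensional, correlated features.
rng = np.random.default_rng(0)
features = rng.normal(size=(40, 10))
embedded = laplacian_eigenmaps(features, n_components=3)
print(embedded.shape)
```

The sketch assumes the neighbourhood graph is connected. If it is not, more than one eigenvalue is zero and skipping a single eigenvector is no longer enough. It also embeds only the points it was given, so mapping unseen frames would need an extra out-of-sample step that is not shown here.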

Keywords :

Gabor filters; channel bank filters; eigenvalues and eigenfunctions; feature extraction; feature selection; speech recognition; 2D Gabor filters; Aurora 4 database; Gabor filter bank features; Laplacian eigenmaps; PNCC baseline; WER; automatic speech recognition; channel selection; feature selection; mel-spectrogram; noise-robust ASR; nonlinear dimension reduction; power normalized cepstral coefficients; power normalized spectrogram; primary auditory cortex; spectrotemporal response fields; tandem acoustic modelling; word error rate; Feature extraction; Hidden Markov models; Laplace equations; Robustness; Speech; Speech recognition; Vectors; Gabor filter-bank; Laplacian Eigenmaps; Multi-layer perceptron;

Language :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location :

Florence

Type :

conf

DOI :

10.1109/ICASSP.2014.6853891

Filename :

6853891

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=178057