Regional Information Center for Science and Technology - Discriminative auditory-based features for robust speech recognition

DocumentCode :

863235

Title :

Discriminative auditory-based features for robust speech recognition

Author :

Mak, Brian Kan-Wing ; Tam, Yik-Cheung ; Li, Peter Qi

Author_Institution :

Dept. of Comput. Sci., Hong Kong Univ. of Sci. & Technol., China

Volume :

Issue :

fYear :

2004

Firstpage :

Lastpage :

Abstract :

Recently, a new auditory-based feature extraction algorithm for robust speech recognition in noisy environments was proposed. The new features are derived by closely mimicking the human peripheral auditory process, and the filters for the outer ear, middle ear, and inner ear are obtained from the psychoacoustics literature with some manual adjustments. In this paper, we extend the auditory-based feature extraction algorithm and propose to further train the auditory-based filters through discriminative training. Using this data-driven approach, we optimize the filters by minimizing the subsequent recognition errors on a task. One significant contribution over similar past efforts (generally under the name of "discriminative feature extraction") is that we make no assumption about the parametric form of the auditory-based filters. Instead, we only require the filters to be triangular-like: the filter weights have a maximum value in the middle and then decrease monotonically toward both ends. Discriminative training of these constrained auditory-based filters leads to improved performance. Furthermore, we study the combined discriminative training procedure for both feature and acoustic model parameters. Our experiments show that the best performance is obtained with a sequential procedure under the unified framework of MCE/GPD.
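The triangular-like constraint described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function names `is_triangular_like` and `build_filter`, the sigmoid-based parameterization, and the example values are all assumptions chosen for illustration. The idea shown is one possible way to keep filter weights peaked in the middle and monotonically decreasing toward both ends while the underlying parameters stay unconstrained, as a gradient-based trainer would need.

```python
import math


def is_triangular_like(weights, tol=1e-12):
    """Return True if weights rise (non-strictly) to a peak and then fall."""
    if not weights:
        return False
    # Index of the first maximum value.
    peak = max(range(len(weights)), key=lambda i: weights[i])
    rising = all(weights[i] <= weights[i + 1] + tol for i in range(peak))
    falling = all(
        weights[i] + tol >= weights[i + 1]
        for i in range(peak, len(weights) - 1)
    )
    return rising and falling


def build_filter(left_params, right_params, peak=1.0):
    """Map unconstrained reals to triangular-like weights (hypothetical scheme).

    Each parameter is squashed to a factor in (0, 1) with a sigmoid, and the
    weights are formed by multiplying the peak by successive factors moving
    outward, so they can only shrink away from the centre. Assumes peak > 0
    and parameters of moderate magnitude (math.exp overflows for very large
    negative inputs).
    """
    def decay(params):
        out, w = [], peak
        for p in params:
            w *= 1.0 / (1.0 + math.exp(-p))
            out.append(w)
        return out

    left = decay(left_params)[::-1]   # innermost weight ends up next to the peak
    right = decay(right_params)
    return left + [peak] + right


if __name__ == "__main__":
    weights = build_filter([0.0, 0.0], [0.0])
    print(weights, is_triangular_like(weights))
```

With such a parameterization any update to the raw parameters yields a filter that still satisfies the shape constraint, so no projection step is needed after each update. Whether the paper enforces the constraint this way is not stated in the abstract.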

Keywords :

acoustic filters; feature extraction; hearing; optimisation; parameter estimation; speech recognition; auditory-based filters; discriminative auditory-based features; discriminative feature extraction; discriminative training; feature extraction algorithm; generalized probabilistic descent; human peripheral auditory process; minimum classification error; noisy environment; psychoacoustics; recognition errors; robust speech recognition; Automatic speech recognition; Ear; Feature extraction; Filters; Hidden Markov models; Mathematical model; Psychoacoustic models; Robustness; Speech recognition; Working environment noise

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/TSA.2003.819951

Filename :

1261269

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=863235