Title :
Data-driven rescaled Teager energy cepstral coefficients for noise-robust speech recognition
Author :
Miau-Luan Hsu ; Chia-Ping Chen
Author_Institution :
Dept. of Comput. Sci. & Eng., Nat. Sun Yat-sen Univ., Kaohsiung, Taiwan
Abstract :
We investigate data-driven rescaled Teager energy cepstral coefficient (DRTECC) features for noise-robust speech recognition. In the first stage, we apply a bank of auditory gammatone filters (GTF) and extract Teager-Kaiser energy (TE) estimates, which replace the commonly used mel spectrum. The output features of the first stage are called Teager energy cepstral coefficients (TECC). In the second stage, we apply a piecewise rescaling operation to the zeroth-order cepstral coefficients to reduce the mismatch between clean and noisy utterances. The segmentation point is determined by voice activity detection (VAD), and the proportionality constants are data-driven. The resulting features are called DRTECC. The proposed features are evaluated on the Aurora 2.0 database. The relative improvements over the baseline MFCC features are significant.
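The two operations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The first function is the standard discrete Teager-Kaiser energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], which would be applied to each gammatone subband signal. The second function is only one plausible reading of the piecewise rescaling: the function name `rescale_c0` and the parameters `split`, `scale_low`, and `scale_high` are hypothetical, and the record does not say how the VAD sets the segmentation point or how the constants are estimated from data.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The output is two samples shorter than the input because the
    operator needs one neighbour on each side.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def rescale_c0(c0, split, scale_low, scale_high):
    """Piecewise rescaling of zeroth-order cepstral coefficients.

    Assumption (not from the source): frames whose c0 lies below
    `split` are multiplied by `scale_low`, the rest by `scale_high`.
    """
    c0 = np.asarray(c0, dtype=float)
    return np.where(c0 < split, scale_low * c0, scale_high * c0)


# Usage on a pure tone: the energy is constant and equals A^2 * sin^2(omega).
n = np.arange(200)
tone = 2.0 * np.cos(0.3 * n)
energy = teager_energy(tone)

# Usage on a toy c0 track with made-up constants.
rescaled = rescale_c0([1.0, 5.0], split=3.0, scale_low=0.5, scale_high=2.0)
```

For a sinusoid of amplitude A and digital frequency omega, the operator returns the constant A^2 sin^2(omega), so it tracks both amplitude and frequency. That is a common argument for using it in place of a plain squared-magnitude energy.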
Keywords :
audio databases; cepstral analysis; channel bank filters; feature extraction; speech recognition; Aurora 2.0 database; DRTECC; GTF; VAD; auditory gammatone filters; data driven rescaled Teager energy cepstral coefficients; extract Teager-Kaiser energy estimates; noise robust speech recognition; piecewise rescaling operation; segmentation point; voice activity detection; Feature extraction; Mel frequency cepstral coefficient; Noise; Noise measurement; Speech; Speech recognition; Teager energy estimation; energy rescale; gamma-tone filters; noise-robust speech recognition;
Conference_Title :
Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific
Conference_Location :
Hollywood, CA
Print_ISBN :
978-1-4673-4863-8