DocumentCode :
1852926
Title :
Robust speech recognition under noisy environments using asymmetric tapers
Author :
Alam, Md Jahangir ; Kenny, Patrick ; O'Shaughnessy, Douglas
Author_Institution :
INRS-EMT, Univ. of Quebec, Montreal, QC, Canada
fYear :
2012
fDate :
27-31 Aug. 2012
Firstpage :
1638
Lastpage :
1642
Abstract :
This paper presents a robust Mel-frequency cepstral coefficient (MFCC) feature extraction method for automatic speech recognition (ASR) based on asymmetric tapers (windows). MFCC features are commonly computed from a direct spectrum estimate that uses a symmetric Hamming taper. Symmetric tapers have linear phase and also imply a longer time delay. ASR systems usually discard phase information, since human speech perception is relatively insensitive to short-time phase distortion, so the linear-phase constraint can be removed without adverse effects. Asymmetric tapers, which have a better frequency response and a shorter time delay, can therefore lead to better recognition performance when used for MFCC feature extraction. The proposed method introduces asymmetry into any symmetric taper by adjusting only one additional parameter, which controls the degree of asymmetry. Experimental results on the AURORA-2 corpus show that the proposed asymmetric tapers outperform the symmetric Hamming taper in word accuracy in both clean and noisy environments.
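The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's construction: the keywords suggest the authors derive their taper with a Hilbert transform, whereas this sketch swaps in a simple exponential ramp to skew a Hamming taper. The function names, the parameter `kappa`, the 200-sample frame length and the 256-point FFT are all assumptions made for illustration.

```python
import numpy as np


def asymmetric_taper(symmetric, kappa):
    """Skew a symmetric taper with an exponential ramp (illustrative assumption).

    kappa is a hypothetical asymmetry parameter: kappa == 1 returns the
    symmetric taper unchanged, kappa < 1 moves the peak toward the start
    of the frame, kappa > 1 moves it toward the end.
    """
    t = np.linspace(-0.5, 0.5, len(symmetric))
    skewed = symmetric * kappa ** t
    # Rescale so the skewed taper keeps the energy of the symmetric one.
    scale = np.sqrt(np.sum(symmetric ** 2) / np.sum(skewed ** 2))
    return skewed * scale


def direct_spectrum(frame, taper, n_fft=256):
    """Tapered direct spectrum estimate: squared magnitude of the windowed FFT."""
    return np.abs(np.fft.rfft(frame * taper, n_fft)) ** 2


frame_len = 200                       # assumed frame length (25 ms at 8 kHz)
hamming = np.hamming(frame_len)       # symmetric baseline taper
skewed = asymmetric_taper(hamming, kappa=0.2)

rng = np.random.default_rng(0)
frame = rng.standard_normal(frame_len)  # stand-in for one speech frame

spec_sym = direct_spectrum(frame, hamming)
spec_asym = direct_spectrum(frame, skewed)
print("peak index (symmetric): ", int(np.argmax(hamming)))
print("peak index (asymmetric):", int(np.argmax(skewed)))
```

In an MFCC front end, either spectrum estimate would then pass through the usual Mel filterbank, logarithm and discrete cosine transform; only the taper changes.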
Keywords :
feature extraction; speech recognition; ASR system; asymmetric tapers; automatic speech recognition; frequency response; human speech perception; linear phase; linearity constraint; noisy environments; phase information; robust mel frequency cepstral coefficient; robust speech recognition; symmetric Hamming tapered direct spectrum estimate; time delay; word accuracy; Accuracy; Mel frequency cepstral coefficient; Noise; Speech; Training; Asymmetric taper; Hilbert transform; double dynamic range
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO)
Conference_Location :
Bucharest
ISSN :
2219-5491
Print_ISBN :
978-1-4673-1068-0
Type :
conf
Filename :
6334099