Title :
Comparison of feature extraction methods for speech recognition in noise-free and in traffic noise environment
Author :
Sárosi, Gellért ; Mozsáry, Mihály ; Mihajlik, Péter ; Fegyó, Tibor
Author_Institution :
Dept. of Telecommun. & Media Inf., Budapest Univ. of Technol. & Econ., Budapest, Hungary
Abstract :
Acoustic feature extraction is a crucial part of a speech recognizer, especially when the application is intended for use in a noisy environment. In this paper we investigate several novel front-end techniques and compare them to multiple baselines. Recognition tests were performed on studio-quality wide-band recordings in Hungarian, as well as on narrow-band telephone speech with real-life noises collected in six languages: English, German, French, Italian, Spanish and Hungarian. The following baseline feature types were used with several settings: Mel Frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP) features implemented in HTK, SPHINX, or by ourselves. The novel methods include Perceptual Minimum Variance Distortionless Response (PMVDR) and multiple variations of the Power-Normalized Cepstral Coefficients (PNCC). In addition, adaptive techniques are applied to reduce convolutive distortions. We observed a significant difference between the MFCC implementations, and the PNCC variations that proved useful differed considerably across bandwidths and noise conditions.
Keywords :
cepstral analysis; feature extraction; speech recognition; traffic; MFCC; Mel Frequency Cepstral Coefficient; acoustic feature extraction; front end technique; narrow band telephone speech; perceptual linear prediction; perceptual minimum variance distortionless response; studio quality; traffic noise; wide band recording; Databases; Hidden Markov models; Noise; Training; multiple languages; multiple sample rates; real-life and white noise; varied SNR
Conference_Title :
Speech Technology and Human-Computer Dialogue (SpeD), 2011 6th Conference on
Conference_Location :
Brasov
Print_ISBN :
978-1-4577-0440-6
DOI :
10.1109/SPED.2011.5940729