DocumentCode :
3564154
Title :
Feature-based noise robust speech recognition on an Indonesian language automatic speech recognition system
Author :
Satriawan, Cil Hardianto ; Lestari, Dessi Puji
Author_Institution :
Dept. of Electr. & Comput. Eng., Inst. Teknol. Bandung, Bandung, Indonesia
fYear :
2014
Firstpage :
42
Lastpage :
46
Abstract :
Mel-frequency Cepstral Coefficients (MFCC) and Perceptual Linear Prediction (PLP) coefficients are two popular representations of continuous speech in existing Hidden Markov Model (HMM) based Automatic Speech Recognition (ASR) systems. Cepstral Mean Normalization (CMN) is often used as a post-processing step in the extraction of MFCC and PLP features to further enhance noise robustness at almost negligible computational cost. In this paper we build a closed-dictionary, large-vocabulary, HMM-based Indonesian language ASR system using the CMU Sphinx speech recognition toolkit, implementing MFCC and PLP feature extraction and CMN. We test the effect of various types and levels of noise on the word error rate (WER) of speech recognition. With CMN, an average recognition improvement of 2% over standard MFCC and PLP extraction is obtained at signal-to-noise ratios (SNR) below 24 decibels. A significant drop in recognition is observed between 12 and 6 dB SNR.
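As an illustration of why CMN adds almost no computational cost, the following minimal sketch subtracts the per-coefficient mean over all frames of one utterance. It is not the CMU Sphinx implementation and is not taken from the paper; the function name `cepstral_mean_normalize`, the per-utterance (batch) averaging, and the toy feature values are assumptions made for this example.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean computed over one utterance.

    `frames` is a list of feature vectors (one list of cepstral
    coefficients per frame). Hypothetical helper for illustration only.
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each cepstral coefficient across all frames of the utterance.
    means = [sum(frame[k] for frame in frames) / n_frames
             for k in range(n_coeffs)]
    # Remove that mean from every frame.
    return [[frame[k] - means[k] for k in range(n_coeffs)]
            for frame in frames]


# Toy utterance: two frames with two coefficients each (made-up values).
utterance = [[1.0, 2.0], [3.0, 6.0]]
normalized = cepstral_mean_normalize(utterance)
```

A fixed channel or microphone response appears as an approximately constant offset in the cepstral domain, so removing the mean cancels much of it with a single pass over the features.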
Keywords :
cepstral analysis; feature extraction; hidden Markov models; natural language processing; speech recognition; CMN; CMU Sphinx; HMM-based Indonesian language ASR system; Indonesian language automatic speech recognition system; MFCC feature extraction; Mel-frequency cepstral coefficients; PLP coefficients; PLP feature extraction; SNR; WER; cepstral mean normalization; continuous speech representations; feature-based noise robust speech recognition; hidden Markov model-based automatic speech recognition systems; perceptual linear prediction coefficients; signal-to-noise ratios; speech recognition toolkit; word error rate; Feature extraction; Mel frequency cepstral coefficient; Noise robustness; Signal to noise ratio; Speech; Speech recognition; ASR; CMN; Indonesian language; MFCC; PLP;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Electrical Engineering and Computer Science (ICEECS), 2014 International Conference on
Print_ISBN :
978-1-4799-8477-0
Type :
conf
DOI :
10.1109/ICEECS.2014.7045217
Filename :
7045217