DocumentCode
246833
Title
Neural response based phoneme classification under noisy condition
Author
Alam, Md Shamsul ; Jassim, Wissam A. ; Zilany, Muhammad S. A.
Author_Institution
Dept. of Biomed. Eng., Univ. of Malaya, Kuala Lumpur, Malaysia
fYear
2014
fDate
1-4 Dec. 2014
Firstpage
175
Lastpage
179
Abstract
Human listeners can recognize speech in noisy environments, whereas most traditional speech recognition methods perform poorly in the presence of noise. Unlike the traditional Mel-frequency cepstral coefficient (MFCC)-based method, this study proposes a phoneme classification technique that uses the neural responses of a physiologically based computational model of the auditory periphery. Neurograms were constructed from the responses of the model auditory nerve to speech phonemes. Features extracted from the neurograms were used to train the recognition system with a Gaussian Mixture Model (GMM) classification technique. Performance was evaluated for different types of phonemes, such as stops, fricatives, and vowels, from the TIMIT database under both quiet and noisy conditions. Although the performance of the proposed method is comparable to that of the MFCC-based classifier in quiet conditions, the proposed neural-response-based method outperforms the traditional MFCC-based method under noisy conditions, even though it uses fewer features. The proposed method could be used in speech recognition applications such as speech-to-text, especially under noisy conditions.
Keywords
Gaussian processes; acoustic noise; cepstral analysis; hearing; mixture models; speech recognition; GMM classification technique; Gaussian mixture model; MFCC-based classifier; MFCC-based method; Mel-frequency cepstral coefficient; TIMIT database; auditory nerve; auditory periphery; human listeners; neural response; neurograms; noisy condition; phoneme classification technique; physiologically-based computational model; recognition system; speech phonemes; speech recognition methods; Accuracy; Computational modeling; Noise; Noise measurement; Robustness; Speech; Speech recognition; GMM; MFCC; auditory nerve model; neurogram; phoneme classification;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Signal Processing and Communication Systems (ISPACS), 2014 International Symposium on
Conference_Location
Kuching
Type
conf
DOI
10.1109/ISPACS.2014.7024447
Filename
7024447