Title :
Speech analysis and recognition using interval statistics generated from a composite auditory model
Author :
Sheikhzadeh, H. ; Deng, L.
Author_Institution :
Dept. of Electr. & Comput. Eng., Waterloo Univ., Ont., Canada
fDate :
1/1/1998
Abstract :
A modeling approach to auditory speech analysis and recognition is proposed and evaluated. A composite auditory model generates parallel sets of auditory-nerve instantaneous firing rates (IFRs) along the spatial dimension, and a subsequent processing stage constructs from the IFRs the interval statistics, in a form called the interpeak interval histogram (IPIH). A speech preprocessor is designed that applies a transformation to the auditory IPIHs and interfaces the IPIH-based auditory representation with a hidden Markov model-based (HMM-based) speech recognizer. The results demonstrate that the new preprocessor consistently outperforms the conventional mel frequency cepstral coefficient-based (MFCC-based) preprocessor at signal-to-noise ratio (SNR) levels up to at least 16 dB.
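The interval-statistics idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the composite auditory model, the peak-picking rule, or the histogram binning, so everything below is an assumption made for illustration: a half-wave rectified sinusoid stands in for one channel's IFR, and the names `find_peaks` and `interpeak_interval_histogram` are hypothetical. It is not the authors' implementation.

```python
import math

def find_peaks(signal, threshold=0.0):
    """Return indices of local maxima that lie above `threshold` (assumed rule)."""
    return [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]
    ]

def interpeak_interval_histogram(signal, bin_width, n_bins):
    """Histogram the gaps (in samples) between successive peaks of one channel.

    `bin_width` is the bin size in samples; gaps that fall beyond the last
    bin are ignored. Both parameters are illustrative choices.
    """
    hist = [0] * n_bins
    peaks = find_peaks(signal)
    for left, right in zip(peaks, peaks[1:]):
        k = (right - left) // bin_width
        if k < n_bins:
            hist[k] += 1
    return hist

# Toy stand-in for one channel's IFR: a half-wave rectified 100 Hz sinusoid
# sampled at 8 kHz (an assumption, not the paper's auditory model).
sample_rate = 8000
ifr = [max(0.0, math.sin(2 * math.pi * 100 * n / sample_rate)) for n in range(800)]

# 8 samples per bin = 1 ms bins, covering intervals up to 20 ms.
hist = interpeak_interval_histogram(ifr, bin_width=8, n_bins=20)
```

In this toy case every interpeak gap is 80 samples (10 ms), so all counts land in a single bin. In a multichannel setting, one such histogram per channel would presumably be combined before being passed to the recognizer front end; the abstract does not say how.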
Keywords :
acoustic signal processing; hearing; hidden Markov models; speech processing; speech recognition; statistical analysis; HMM-based speech recognizer; SNR; auditory representation; auditory speech analysis; auditory-nerve instantaneous firing rates; composite auditory model; hidden Markov model; interpeak interval histogram; interval statistics; mel frequency cepstral coefficient; processing stage; signal-to-noise ratio; spatial dimension; speech preprocessor; Acoustic beams; Acoustic signal processing; Computational complexity; Hidden Markov models; Natural languages; Neural networks; Pattern recognition; Speech analysis; Speech processing; Speech recognition;
Journal_Title :
Speech and Audio Processing, IEEE Transactions on