DocumentCode :
3156975
Title :
Gradient-based acoustic features for speech recognition
Author :
Muroi, Takashi ; Takashima, Ryoichi ; Takiguchi, Tetsuya ; Ariki, Yasuo
Author_Institution :
Dept. of Computer Sci. & Syst. Eng., Kobe Univ., Kobe, Japan
fYear :
2009
fDate :
7-9 Jan. 2009
Firstpage :
445
Lastpage :
448
Abstract :
This paper proposes a novel feature extraction method for speech recognition based on gradient features on a 2D time-frequency matrix. Widely used MFCC features lack temporal dynamics. In addition, ΔMFCC is an indirect expression of temporal frequency changes. To extract the temporal dynamics more directly, we propose local gradient features in an area around a reference position. The gradient-based features were originally proposed as HOG (histograms of oriented gradients) and applied to human body detection in image recognition. In this paper, we expand the application to include gradient-based acoustic features in speech recognition. The novel acoustic features were evaluated on a word-speech recognition task, and the results showed a significant improvement for clean speech and even for noisy speech when coupled with MFCC.
Keywords :
acoustic signal processing; cepstral analysis; feature extraction; gradient methods; speech recognition; time-frequency analysis; MFCC features; feature extraction; gradient-based acoustic features; local gradient features; mel-frequency cepstrum coefficient; speech recognition; temporal dynamics; time-frequency matrix; word-speech recognition task; Acoustic applications; Acoustic signal detection; Feature extraction; Histograms; Humans; Image recognition; Mel frequency cepstral coefficient; Speech analysis; Speech recognition; Time frequency analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Signal Processing and Communication Systems, 2009. ISPACS 2009. International Symposium on
Conference_Location :
Kanazawa
Print_ISBN :
978-1-4244-5015-2
Electronic_ISBN :
978-1-4244-5016-9
Type :
conf
DOI :
10.1109/ISPACS.2009.5383805
Filename :
5383805