DocumentCode
1099992
Title
Environmental Sound Recognition With Time–Frequency Audio Features
Author
Chu, Selina ; Narayanan, Shrikanth ; Kuo, C.-C. Jay
Author_Institution
Dept. of Comput. Sci., Univ. of Southern California, Los Angeles, CA
Volume
17
Issue
6
fYear
2009
Firstpage
1142
Lastpage
1158
Abstract
The paper considers the task of recognizing environmental sounds in order to understand the scene or context surrounding an audio sensor. A variety of features have been proposed for audio recognition, including the popular Mel-frequency cepstral coefficients (MFCCs), which describe the audio spectral shape. Environmental sounds, such as the chirping of insects and the sound of rain, are typically noise-like with a broad flat spectrum but may include strong temporal-domain signatures. However, only a few temporal-domain features have previously been developed to characterize such diverse audio signals. Here, we perform an empirical feature analysis for audio environment characterization and propose using the matching pursuit (MP) algorithm to obtain effective time-frequency features. The MP-based method uses a dictionary of atoms for feature selection, resulting in a flexible, intuitive, and physically interpretable set of features. The MP-based features supplement the MFCC features to yield higher recognition accuracy for environmental sounds. Extensive experiments are conducted to demonstrate the effectiveness of these joint features for unstructured environmental sound classification, including listening tests to study human recognition capabilities. Our recognition system is shown to produce performance comparable to that of human listeners.
Keywords
audio signal processing; pattern recognition; time-frequency analysis; Mel-frequency cepstral coefficients; audio sensor; broad flat spectrum; human recognition; matching pursuit algorithm; sound classification; sound recognition; temporal domain signatures; time-frequency audio features; Acoustic noise; Acoustic sensors; Cepstral analysis; Chirp; Humans; Insects; Layout; Matching pursuit algorithms; Rain; Spectral shape; Audio classification; Mel-frequency cepstral coefficient (MFCC); auditory scene recognition; data representation; feature extraction; feature selection; matching pursuit
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher
ieee
ISSN
1558-7916
Type
jour
DOI
10.1109/TASL.2009.2017438
Filename
5109766