Title :
Spectral vs. spectro-temporal features for acoustic event detection
Author :
Cotton, Courtenay V. ; Ellis, Daniel P W
Author_Institution :
Dept. of Electr. Eng., Columbia Univ., New York, NY, USA
Abstract :
Automatic detection of different types of acoustic events is an interesting problem in soundtrack processing. Typical approaches to the problem use short-term spectral features to describe the audio signal, with additional modeling on top to take temporal context into account. We propose an approach to detecting and modeling acoustic events that directly describes temporal context, using convolutive non-negative matrix factorization (NMF). NMF is useful for finding parts-based decompositions of data; here it is used to discover a set of spectro-temporal patch bases that best describe the data, with the patches corresponding to event-like structures. We derive features from the activations of these patch bases, and perform event detection on a database consisting of 16 classes of meeting-room acoustic events. We compare our approach with a baseline using standard short-term mel frequency cepstral coefficient (MFCC) features. We demonstrate that the event-based system is more robust in the presence of added noise than the MFCC-based system, and that a combination of the two systems performs even better than either individually.
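The convolutive NMF step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the commonly used multiplicative updates for a KL-type divergence, and the names `shift`, `reconstruct`, `conv_nmf`, `n_bases`, and `patch_len` are hypothetical, chosen here for illustration. A non-negative spectrogram `V` (frequency x time) is approximated as a sum of time-shifted patch bases `W[t]` weighted by activations `H`.

```python
import numpy as np

def shift(X, t):
    """Shift the columns of X right by t frames (left if t < 0), zero-filling."""
    out = np.zeros_like(X)
    if t == 0:
        out[:] = X
    elif t > 0:
        out[:, t:] = X[:, :-t]
    else:
        out[:, :t] = X[:, -t:]
    return out

def reconstruct(W, H):
    """Approximate V as sum over lags t of W[t] @ shift(H, t).

    W has shape (patch_len, n_freq, n_bases); H has shape (n_bases, n_frames).
    """
    return sum(W[t] @ shift(H, t) for t in range(W.shape[0]))

def conv_nmf(V, n_bases=4, patch_len=5, n_iter=100, seed=0, eps=1e-9):
    """Illustrative convolutive NMF with multiplicative updates (assumed form)."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((patch_len, n_freq, n_bases)) + eps
    H = rng.random((n_bases, n_frames)) + eps
    ones = np.ones_like(V, dtype=float)
    for _ in range(n_iter):
        # Ratio of the data to the current model.
        R = V / (reconstruct(W, H) + eps)
        # Activation update: average the per-lag multiplicative updates.
        H_new = np.zeros_like(H)
        for t in range(patch_len):
            H_new += H * (W[t].T @ shift(R, -t)) / (W[t].T @ ones + eps)
        H = H_new / patch_len
        # Basis update: one multiplicative update per lag.
        R = V / (reconstruct(W, H) + eps)
        for t in range(patch_len):
            Ht = shift(H, t)
            W[t] *= (R @ Ht.T) / (ones @ Ht.T + eps)
    return W, H

if __name__ == "__main__":
    # Synthetic non-negative "spectrogram" standing in for real audio features.
    V = np.random.default_rng(1).random((20, 60))
    W, H = conv_nmf(V)
    print(W.shape, H.shape)
```

Each basis `W[:, :, k]` is a spectro-temporal patch, and row `k` of `H` records when that patch is active. How the activations are turned into per-event features is not specified in this record, so that step is left out of the sketch.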
Keywords :
acoustic convolution; audio signal processing; matrix decomposition; MFCC-based system; NMF; acoustic event detection; acoustic signal processing; audio signal; automatic detection; convolutive NMF; mel frequency cepstral coefficient; non-negative matrix factorization; short-term spectral features; soundtrack processing; spectral features; spectro-temporal features; spectro-temporal patch; Event detection; Feature extraction; Hidden Markov models; Noise; Signal processing algorithms; acoustic event classification
Conference_Title :
Applications of Signal Processing to Audio and Acoustics (WASPAA), 2011 IEEE Workshop on
Conference_Location :
New Paltz, NY
Print_ISBN :
978-1-4577-0692-9
Electronic_ISBN :
1931-1168
DOI :
10.1109/ASPAA.2011.6082331