DocumentCode :
2182749
Title :
Classifying soundtracks with audio texture features
Author :
Ellis, Daniel P. W. ; Zeng, Xiaohong ; McDermott, Josh H.
Author_Institution :
Dept. of Electr. Eng., Columbia Univ., New York, NY, USA
fYear :
2011
fDate :
22-27 May 2011
Firstpage :
5880
Lastpage :
5883
Abstract :
Sound textures may be defined as sounds whose character depends on statistical properties as much as the specific details of each individually-perceived event. Recent work has devised a set of statistics that, when synthetically imposed, allow listeners to identify a wide range of environmental sound textures. In this work, we investigate using these statistics for automatic classification of a set of environmental sound classes defined over a set of web videos depicting "multimedia events". We show that the texture statistics perform as well as our best conventional statistics (based on MFCC covariance). We further examine the relative contributions of the different statistics, showing the importance of modulation spectra and cross-band envelope correlations.
Keywords :
audio signal processing; audio texture features; cross-band envelope correlations; environmental sound classes; modulation spectra; soundtracks; Accuracy; Correlation; Mel frequency cepstral coefficient; Modulation; Speech; Support vector machines; Videos; Sound textures; environmental sound; soundtrack classification;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
ISSN :
1520-6149
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2011.5947699
Filename :
5947699