DocumentCode :
1502995
Title :
Audio-Based Semantic Concept Classification for Consumer Video
Author :
Lee, Keansub; Ellis, Daniel P. W.
Author_Institution :
Electr. Eng. Dept., Columbia Univ., New York, NY, USA
Volume :
18
Issue :
6
Year :
2010
Firstpage :
1406
Lastpage :
1416
Abstract :
This paper presents a novel method for automatically classifying consumer video clips based on their soundtracks. We use a set of 25 overlapping semantic classes, chosen for their usefulness to users, viability of automatic detection and of annotator labeling, and sufficiency of representation in available video collections. A set of 1873 videos from real users has been annotated with these concepts. Starting with a basic representation of each video clip as a sequence of mel-frequency cepstral coefficient (MFCC) frames, we experiment with three clip-level representations: single Gaussian modeling, Gaussian mixture modeling, and probabilistic latent semantic analysis of a Gaussian component histogram. Using such summary features, we produce support vector machine (SVM) classifiers based on the Kullback-Leibler, Bhattacharyya, or Mahalanobis distance measures. Quantitative evaluation shows that our approaches are effective for detecting interesting concepts in a large collection of real-world consumer video clips.
Keywords :
Gaussian processes; audio signal processing; image classification; image representation; object detection; probability; support vector machines; Bhattacharyya distance measure; Gaussian component histogram; Gaussian mixture modeling; Kullback-Leibler distance measure; MFCC frame; Mahalanobis distance measure; SVM classifier; annotator labeling; audio-based semantic concept classification; clip-level representation; consumer video clip classification; mel-frequency cepstral coefficient; probabilistic latent semantic analysis; semantic concept detection; soundtrack; support vector machine; video clip representation; video collection; audio classification; consumer video classification; soundtrack analysis
Language :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2009.2034776
Filename :
5290083