DocumentCode :
3296595
Title :
Grouplet-Based Distance Metric Learning for Video Concept Detection
Author :
Jiang, Wei ; Loui, Alexander C.
Author_Institution :
Corp. Res. & Eng., Eastman Kodak Co., Rochester, NY, USA
fYear :
2012
fDate :
9-13 July 2012
Firstpage :
753
Lastpage :
758
Abstract :
We investigate general concept detection in unconstrained videos. A distance metric learning algorithm is developed to use the information of the grouplet structure for improved detection. A grouplet is defined as a set of audio and/or visual codewords that are grouped together according to their strong correlations in videos. By using entire grouplets as building elements, concepts can be detected more robustly than by using discrete audio or visual codewords. Compared with the traditional method of generating aggregated grouplet-based features for classification, our grouplet-based distance metric learning approach directly learns distances between data points, which better preserves the grouplet structure. Specifically, our algorithm uses an iterative quadratic programming formulation in which the optimal distance metric can be effectively learned based on the large-margin nearest-neighbor setting. The framework is flexible: various types of distances can be computed using individual grouplets, and the same distance metric learning algorithm combines the distances computed over individual grouplets for final classification. We extensively evaluate our method over the large-scale Columbia Consumer Video set. Experiments demonstrate that our approach can achieve consistent and significant performance improvements.
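The abstract's idea of combining per-grouplet distances under a large-margin nearest-neighbor criterion can be illustrated with a minimal sketch. This is not the authors' algorithm: the non-negative weighted sum, the function names `combined_distance` and `large_margin_hinge_loss`, the unit margin, and the use of all same-label points in place of selected target neighbors are assumptions made for illustration. The paper's iterative quadratic programming solver is not reproduced here; the sketch only evaluates the kind of hinge objective such a solver would try to reduce.

```python
from typing import Sequence


def combined_distance(per_grouplet: Sequence[float],
                      weights: Sequence[float]) -> float:
    """Weighted sum of distances computed over individual grouplets.

    Assumption: the learned metric is a non-negative linear combination
    of per-grouplet distances (hypothetical simplification).
    """
    if len(per_grouplet) != len(weights):
        raise ValueError("one weight is required per grouplet")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return sum(w * d for w, d in zip(weights, per_grouplet))


def large_margin_hinge_loss(dists: Sequence[Sequence[Sequence[float]]],
                            labels: Sequence[int],
                            weights: Sequence[float],
                            margin: float = 1.0) -> float:
    """Sum of hinge penalties over triplets (i, j, l).

    dists[i][j] holds the per-grouplet distances between points i and j.
    For each point i, every same-label point j should be closer to i than
    every differently labeled point l by at least `margin`.
    Assumption: all same-label points are used, not only k target neighbors.
    """
    n = len(labels)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if j == i or labels[j] != labels[i]:
                continue
            d_ij = combined_distance(dists[i][j], weights)
            for l in range(n):
                if labels[l] == labels[i]:
                    continue
                d_il = combined_distance(dists[i][l], weights)
                total += max(0.0, margin + d_ij - d_il)
    return total


# Toy data: three points, two grouplets. Points 0 and 1 share a label.
ZERO = [0.0, 0.0]
SAME = [1.0, 5.0]   # grouplet 0 says "close", grouplet 1 says "far"
DIFF = [4.0, 1.0]   # grouplet 0 says "far", grouplet 1 says "close"
TOY_DISTS = [
    [ZERO, SAME, DIFF],
    [SAME, ZERO, DIFF],
    [DIFF, DIFF, ZERO],
]
TOY_LABELS = [0, 0, 1]
```

In this toy setup, putting all weight on grouplet 0 separates the two labels with no margin violations, while putting all weight on grouplet 1 does not. A learning procedure would search for weights that lower this loss.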
Keywords :
audio-visual systems; image classification; iterative methods; quadratic programming; video signal processing; vocabulary; classification; discrete audio codewords; discrete visual codewords; distance metric learning algorithm; grouplet-based distance metric learning; grouplet-based feature generation; iterative quadratic programming formulation; large-margin nearest neighbor setting; large-scale Columbia Consumer Video set; optimal distance metric; unconstrained videos; video concept detection; Correlation; Feature extraction; Kernel; Measurement; Support vector machines; Training; Visualization; distance metric learning; grouplet; video concept classification;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Multimedia and Expo (ICME), 2012 IEEE International Conference on
Conference_Location :
Melbourne, VIC
ISSN :
1945-7871
Print_ISBN :
978-1-4673-1659-0
Type :
conf
DOI :
10.1109/ICME.2012.123
Filename :
6298493