DocumentCode :
3194625
Title :
Semantic concept detection in imbalanced datasets based on different under-sampling strategies
Author :
Guo, Jinlin ; Foley, Colum ; Gurrin, Cathal ; Lao, Songyang
Author_Institution :
School of Computing, Dublin City University, Ireland
fYear :
2011
fDate :
11-15 July 2011
Firstpage :
1
Lastpage :
6
Abstract :
Semantic concept detection is a very useful technique for developing powerful retrieval or filtering systems for multimedia data. To date, methods for concept detection have been converging on generic classification schemes. However, classification algorithms often face imbalanced-dataset or rare-class problems, which degrade the performance of many classifiers. In this paper, we adopt three “under-sampling” strategies to handle the imbalanced dataset issue in an SVM classification framework and evaluate their performance on the TRECVid 2007 dataset, with additional positive samples from the TRECVid 2010 development set. Experimental results show that our well-designed “under-sampling” methods (method SAK) increase the performance of concept detection by about 9.6% overall. In cases of extreme imbalance in the collection, the proposed methods perform worse than a baseline sampling method (method SI); however, in the majority of cases, our proposed methods increase the performance of concept detection substantially. We also conclude that method SAK is a promising solution for SVM classification on datasets that are not extremely imbalanced.
Keywords :
Classification; Imbalanced Dataset; SVM; TRECVid; Under-sampling;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Multimedia and Expo (ICME), 2011 IEEE International Conference on
Conference_Location :
Barcelona, Spain
ISSN :
1945-7871
Print_ISBN :
978-1-61284-348-3
Electronic_ISBN :
1945-7871
Type :
conf
DOI :
10.1109/ICME.2011.6011923
Filename :
6011923