Title :
Ridge Regression based classifiers for large scale class imbalanced datasets
Author :
Arpit, D. ; Shuang Wu ; Natarajan, Prem ; Prasad, Ranga ; Natarajan, Prem
Author_Institution :
Speech, Language & Multimedia Bus. Unit, Raytheon BBN Technol., Cambridge, MA, USA
Abstract :
Large-scale, class-imbalanced data classification is a challenging task that arises frequently in computer vision applications such as web video retrieval. A number of algorithms have been proposed in the literature that approach this problem from different perspectives (e.g., sampling, cost-sensitive learning, active learning). The challenge in this task is twofold. First, the data imbalance causes many classification algorithms to learn trivial classifiers that declare all test examples to be from the majority class. Second, many algorithms do not scale to large dataset sizes. We address these two issues by using two different cost-sensitive versions of Ridge Regression as our binary classifiers. We demonstrate our approach by retrieving unstructured web videos for 10 events on the benchmark TRECVID MED 12 dataset, which contains ≈47000 videos. We empirically show that these classifiers perform on par with state-of-the-art support vector machine based classifiers using χ² kernels while being 30 to 60 times faster.
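The abstract does not spell out the two cost-sensitive formulations, so the following is only a minimal sketch of one common variant: ridge regression with per-example weights, solved in closed form as w = (XᵀCX + λI)⁻¹XᵀCy, where C is a diagonal matrix of costs. The function names (`fit_weighted_ridge`, `decision_function`), the parameters `lam` and `pos_weight`, and the default of weighting positives by the negative-to-positive ratio are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def fit_weighted_ridge(X, y, lam=1.0, pos_weight=None):
    """Fit a cost-weighted ridge regressor on labels y in {-1, +1}.

    Hypothetical helper: weights each positive example by `pos_weight`
    (default: n_neg / n_pos) and each negative example by 1.
    Returns a vector of d feature weights followed by a bias term.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, d = X.shape

    n_pos = int(np.sum(y > 0))
    n_neg = n - n_pos
    if pos_weight is None:
        # Assumed default: balance the total cost of the two classes.
        pos_weight = n_neg / max(n_pos, 1)

    # Per-example costs (the diagonal of C).
    c = np.where(y > 0, pos_weight, 1.0)

    # Append a constant column so the last coefficient acts as a bias.
    Xb = np.hstack([X, np.ones((n, 1))])

    # L2 penalty on the feature weights only, not on the bias.
    reg = lam * np.eye(d + 1)
    reg[-1, -1] = 0.0

    # Weighted normal equations: (Xb^T C Xb + reg) w = Xb^T C y.
    A = Xb.T @ (c[:, None] * Xb) + reg
    b = Xb.T @ (c * y)
    return np.linalg.solve(A, b)


def decision_function(w, X):
    """Real-valued scores; larger means more likely positive."""
    X = np.asarray(X, dtype=float)
    return X @ w[:-1] + w[-1]
```

Training reduces to one (d+1)×(d+1) linear solve, which is one plausible reason a ridge-style classifier can be much cheaper to train than a kernel SVM. For retrieval, test videos would be ranked by `decision_function` scores rather than thresholded.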
Keywords :
Internet; computer vision; image classification; regression analysis; support vector machines; video retrieval; video signal processing; χ2 kernels; TRECVID MED dataset; Web video retrieval; binary classifiers; class imbalanced data classification; classification algorithms; computer vision tasks; large scale class imbalanced datasets; ridge regression; support vector machine; trivial classifiers; Joints; Kernel; Linear programming; Minimization; Support vector machines; Training; Vectors;
Conference_Title :
Applications of Computer Vision (WACV), 2013 IEEE Workshop on
Conference_Location :
Tampa, FL
Print_ISBN :
978-1-4673-5053-2
Electronic_ISBN :
1550-5790
DOI :
10.1109/WACV.2013.6475028