Title :
Building Robust Concept Detectors from Clickthrough Data: A Study in the MSR-Bing Dataset
Author :
Sarafis, Ioannis ; Diou, Christos ; Delopoulos, Anastasios
Author_Institution :
Electr. & Comput. Eng. Dept., Aristotle Univ. of Thessaloniki, Thessaloniki, Greece
Abstract :
In this paper we extend our previous work on strategies for automatically constructing noise-resilient SVM detectors from clickthrough data for large-scale concept-based image retrieval. First, search log data is used in conjunction with Information Retrieval (IR) models to score images with respect to each concept. The IR models evaluated in this work are Vector Space Models (VSM), BM25 and Language Models (LM). The scored images are then used to create training sets for the standard SVM and to derive sample weights for two SVM variants: the Fuzzy SVM (FSVM) and the Power SVM (PSVM). These variants assign a weight to each individual training sample and can therefore model label uncertainty at the classifier level. Experiments on the MSR-Bing Image Retrieval Grand Challenge dataset (consisting of 1M images and 82.3M unique clicks) show that FSVM is the most robust SVM algorithm for handling label noise and that the highest performance is achieved with weights derived from VSM. These results extend our previous findings on the value of FSVM from professional image archives to large-scale general-purpose search engines, and furthermore identify VSM as the most appropriate sample weighting model.
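The two-stage pipeline described in the abstract (IR-model scoring of images from click logs, then per-sample weights in the SVM objective) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the click-log layout, the function names `vsm_scores` and `fsvm_objective`, the particular TF-IDF and IDF variant, and the use of the raw cosine score as the sample weight are all hypothetical. The only part taken from the general literature is the shape of the Fuzzy SVM primal objective, in which each slack term is multiplied by a per-sample weight.

```python
import math
from collections import Counter


def vsm_scores(concept, click_log):
    """Score each image against a concept with a simple TF-IDF cosine (VSM).

    click_log is a hypothetical structure: image_id -> list of (query, clicks).
    Each image is treated as a "document" built from the queries that led to
    clicks on it, with term counts scaled by the number of clicks.
    """
    docs = {}
    for image_id, entries in click_log.items():
        tf = Counter()
        for query, clicks in entries:
            for term in query.lower().split():
                tf[term] += clicks
        docs[image_id] = tf

    n_docs = len(docs)
    # Document frequency: in how many image "documents" each term appears.
    df = Counter(term for tf in docs.values() for term in tf)
    # One common smoothed IDF variant (an assumption, not taken from the paper).
    idf = {t: math.log(1.0 + n_docs / df[t]) for t in df}

    q_vec = {t: idf.get(t, 0.0) for t in concept.lower().split()}
    q_norm = math.sqrt(sum(v * v for v in q_vec.values()))

    scores = {}
    for image_id, tf in docs.items():
        d_vec = {t: tf[t] * idf[t] for t in tf}
        d_norm = math.sqrt(sum(v * v for v in d_vec.values()))
        dot = sum(q_vec.get(t, 0.0) * v for t, v in d_vec.items())
        scores[image_id] = dot / (q_norm * d_norm) if q_norm and d_norm else 0.0
    return scores


def fsvm_objective(w, b, X, y, s, C=1.0):
    """Fuzzy-SVM-style primal objective for a linear model.

    0.5 * ||w||^2 + C * sum_i s_i * max(0, 1 - y_i * (w . x_i + b))
    where s_i is the per-sample weight (here assumed to come from vsm_scores).
    """
    reg = 0.5 * sum(wi * wi for wi in w)
    loss = 0.0
    for xi, yi, si in zip(X, y, s):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        loss += si * max(0.0, 1.0 - margin)
    return reg + C * loss
```

In this sketch, a sample whose label is weakly supported by the click log receives a small weight, so a margin violation on that sample contributes less to the objective than one on a strongly supported sample. Swapping in BM25 or a language-model score would only change how the weights are produced, not the form of the objective.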
Keywords :
image retrieval; learning (artificial intelligence); search engines; support vector machines; BM25; FSVM; IR models; MSR-Bing image retrieval grand challenge dataset; PSVM; VSM; clickthrough data; fuzzy SVM; information retrieval models; language models; large scale concept-based image retrieval; model label uncertainty; noise resilient SVM detectors; power SVM; robust concept detectors; scored images; search engines; search log data; vector space models; weighting model; Detectors; Noise; Robustness; Search engines; Support vector machines; Training; Vectors; Fuzzy SVM; Power SVM; Support Vector Machine; clickthrough data; concept based image retrieval;
Conference_Title :
Semantic and Social Media Adaptation and Personalization (SMAP), 2014 9th International Workshop on
Conference_Location :
Corfu
Print_ISBN :
978-1-4799-6813-8
DOI :
10.1109/SMAP.2014.22