Title :
An evaluation of content-based duplicate image detection methods for web search
Author :
Thomee, Bart ; Huiskes, Mark J. ; Bakker, Erwin M. ; Lew, Michael S.
Author_Institution :
Yahoo! Research, Barcelona, Spain
Abstract :
The World Wide Web is filled with billions of images, and duplicates of images can frequently be found on many websites. These duplicates can be exact copies or differ slightly in their visual content. In this paper we provide a comparative study of how well content-based duplicate image detection methods are able to detect the duplicates of a query image. We conduct a survey to better understand the ways in which such images on the internet differ from each other, and we use these observations to form a realistic and challenging duplicate image detection scenario. The methods we evaluate in our study are representative techniques from the research literature. In our evaluation, we target the performance of each method in relation to its descriptor size, description time and matching time, to assess the feasibility of applying it to large image collections (> 1 million images).
Keywords :
Internet; image processing; image retrieval; Web search; Websites; World Wide Web; content-based duplicate image detection methods; large image collections; query image; Accuracy; Discrete wavelet transforms; Image color analysis; Image representation; Visualization; Content-based duplicate image detection; image redundancy
Conference_Title :
Multimedia and Expo (ICME), 2013 IEEE International Conference on
Conference_Location :
San Jose, CA
DOI :
10.1109/ICME.2013.6607451