DocumentCode :
3637095
Title :
Unsupervised discovery of co-occurrence in sparse high dimensional data
Author :
Ondřej Chum;Jiří Matas
Author_Institution :
CMP, Dept. of Cybernetics, Faculty of Elec. Eng., Czech Technical University in Prague
fYear :
2010
Firstpage :
3416
Lastpage :
3423
Abstract :
An efficient min-Hash-based algorithm for the discovery of dependencies in sparse high-dimensional data is presented. The dependencies are represented by sets of features co-occurring with high probability and are called co-ocsets. Sparse high-dimensional descriptors, such as bag of words, have proven very effective in image retrieval. To maintain high efficiency even for very large data collections, features are assumed to be independent. We show experimentally that co-ocsets are not rare, i.e. the independence assumption is often violated, and that they may ruin retrieval performance if present in the query image. Two methods for managing co-ocsets in such cases are proposed. Both methods significantly outperform the state of the art in image retrieval; one is also significantly faster.
Keywords :
"Image retrieval","Image databases","Information retrieval","Visual databases","Frequency","Cybernetics","Vocabulary","Random variables","Visualization","Surface treatment"
Publisher :
IEEE
Conference_Title :
Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on
ISSN :
1063-6919
Print_ISBN :
978-1-4244-6984-0
Type :
conf
DOI :
10.1109/CVPR.2010.5539997
Filename :
5539997