Title :
Are you using the right approximate nearest neighbor algorithm?
Author :
O'Hara, S.; Draper, Bruce A.
Author_Institution :
Colorado State Univ., Fort Collins, CO, USA
Abstract :
Many computer vision tasks, such as large-scale image retrieval and nearest-neighbor classification, perform similarity searches using Approximate Nearest Neighbor (ANN) indexes, and their success depends on the quality of ANN retrieval. Popular indexing methods for ANN queries include forests of kd-trees (KDT) and hierarchical k-means (HKM). The dominance of these two methods has led to implementations in many popular computer vision toolkits, perhaps leading some to assume that ANN indexing is a solved problem. This paper introduces a new method, the Proximity Forest, which is based on a forest of randomized metric trees. We present evidence that forests of metric trees significantly outperform KDT and HKM when evaluated on real-world data, controlling for dimensionality, data set size, distance function, and the number of neighbors returned by the query. Randomized metric trees are simple to implement, are generic with respect to the chosen distance function, and can be applied to a wider class of data representations.
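Note: the abstract describes the Proximity Forest only at a high level. The sketch below is an illustrative approximation of the general idea of a forest of randomized metric trees (here, vantage-point-style trees with randomly chosen pivots and a greedy "defeatist" descent), not the authors' implementation; all names and parameters (ProximityForestSketch, n_trees, leaf_size) are hypothetical. It does show the property the abstract emphasizes: any user-supplied distance function can be used, with no assumption of a vector representation.

import heapq
import random

class _Node:
    __slots__ = ("pivot", "radius", "inner", "outer", "points")
    def __init__(self, pivot=None, radius=None, inner=None, outer=None, points=None):
        self.pivot = pivot      # randomly chosen pivot item (internal nodes)
        self.radius = radius    # median distance to the pivot (split threshold)
        self.inner = inner      # subtree of items within radius of the pivot
        self.outer = outer      # subtree of items beyond radius of the pivot
        self.points = points    # stored items (leaf nodes only)

def _build(points, dist, leaf_size, rng):
    if len(points) <= leaf_size:
        return _Node(points=points)
    # Random pivot choice gives each tree in the forest a different partition.
    pivot = points[rng.randrange(len(points))]
    dists = [dist(pivot, p) for p in points]
    radius = sorted(dists)[len(dists) // 2]   # median split
    inner = [p for p, d in zip(points, dists) if d <= radius]
    outer = [p for p, d in zip(points, dists) if d > radius]
    if not inner or not outer:                # degenerate split: stop here
        return _Node(points=points)
    return _Node(pivot, radius,
                 _build(inner, dist, leaf_size, rng),
                 _build(outer, dist, leaf_size, rng))

def _search(node, q, dist, out):
    # Greedy descent: follow only the branch on the query's side of the split.
    if node.points is not None:
        for p in node.points:
            out.append((dist(q, p), p))
        return
    if dist(q, node.pivot) <= node.radius:
        _search(node.inner, q, dist, out)
    else:
        _search(node.outer, q, dist, out)

class ProximityForestSketch:
    """Forest of randomized metric trees for approximate k-NN (illustrative only)."""
    def __init__(self, points, dist, n_trees=10, leaf_size=16, seed=0):
        rng = random.Random(seed)
        self.dist = dist
        self.trees = [_build(list(points), dist, leaf_size, rng) for _ in range(n_trees)]

    def knn(self, q, k=5):
        candidates = []
        for root in self.trees:               # each tree contributes one candidate leaf
            _search(root, q, self.dist, candidates)
        # De-duplicate shared items (same objects appear in several trees), keep k closest.
        best = {}
        for d, p in candidates:
            best[id(p)] = (d, p)
        return heapq.nsmallest(k, best.values())

# Usage with an arbitrary distance function (Euclidean here, but any metric works):
if __name__ == "__main__":
    import math
    data = [tuple(random.random() for _ in range(8)) for _ in range(1000)]
    forest = ProximityForestSketch(data, math.dist, n_trees=8)
    print(forest.knn(data[0], k=3))

Querying every tree and merging the candidate leaves is what makes the randomization useful: each tree partitions the data differently, so the union of candidates is more likely to contain the true nearest neighbors than any single greedy descent.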
Keywords :
computer vision; image classification; image representation; image retrieval; indexing; learning (artificial intelligence); trees (mathematics); ANN index; HKM method; KDT method; approximate nearest neighbor algorithm; computer vision; data representation; data set size; dimensionality; distance function; hierarchical k-means method; image retrieval; indexing method; kd-trees method; nearest-neighbor classification; proximity forest method; randomized metric trees; Accuracy; Artificial neural networks; Computer vision; Indexing; Measurement; Vectors; Vegetation;
Conference_Titel :
Applications of Computer Vision (WACV), 2013 IEEE Workshop on
Conference_Location :
Tampa, FL
Print_ISBN :
978-1-4673-5053-2
Electronic_ISSN :
1550-5790
DOI :
10.1109/WACV.2013.6474993