Title :
Effective Proximity Retrieval by Ordering Permutations
Author :
Gonzalez, E.C. ; Figueroa, K. ; Navarro, G.
Author_Institution :
Fac. de Cienc., Univ. Michoacana, Morelia
Abstract :
We introduce a new probabilistic proximity search algorithm for range and k-nearest neighbor (k-NN) searching in both coordinate and metric spaces. Although there exist solutions for these problems, they boil down to a linear scan when the space is intrinsically high dimensional, as is the case in many pattern recognition tasks. This, for example, renders the k-NN approach to classification rather slow in large databases. Our novel idea is to predict closeness between elements according to how they order their distances toward a distinguished set of anchor objects. Each element in the space sorts the anchor objects from closest to farthest to it and the similarity between orders turns out to be an excellent predictor of the closeness between the corresponding elements. We present extensive experiments comparing our method against state-of-the-art exact and approximate techniques, both in synthetic and real, metric and nonmetric databases, measuring both CPU time and distance computations. The experiments demonstrate that our technique almost always improves upon the performance of alternative techniques, in some cases by a wide margin.
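The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the similarity between orders is measured, so this sketch assumes the Spearman footrule (the sum of rank displacements). The function names, the `fraction` parameter, and the use of Euclidean distance in the demo are all hypothetical choices made for illustration.

```python
import math
import random


def permutation(obj, anchors, dist):
    """Return the anchor indices sorted from closest to farthest from obj."""
    return sorted(range(len(anchors)), key=lambda i: dist(obj, anchors[i]))


def footrule(perm_a, perm_b):
    """Spearman footrule (assumed measure): total displacement of each anchor's rank."""
    pos_b = {anchor: rank for rank, anchor in enumerate(perm_b)}
    return sum(abs(rank - pos_b[anchor]) for rank, anchor in enumerate(perm_a))


def build_index(database, anchors, dist):
    """Precompute one anchor permutation per database element."""
    return [permutation(obj, anchors, dist) for obj in database]


def knn_search(query, database, index, anchors, dist, k, fraction):
    """Approximate k-NN: examine only the elements whose permutations are
    most similar to the query's, then rank those by true distance.

    `fraction` (hypothetical parameter) is the share of the database that
    is compared directly against the query.
    """
    q_perm = permutation(query, anchors, dist)
    # Order the database by predicted closeness (permutation similarity).
    order = sorted(range(len(database)), key=lambda i: footrule(q_perm, index[i]))
    budget = max(k, int(fraction * len(database)))
    candidates = order[:budget]
    # Only the candidates incur real distance computations.
    candidates.sort(key=lambda i: dist(query, database[i]))
    return candidates[:k]


if __name__ == "__main__":
    rng = random.Random(0)
    points = [(rng.random(), rng.random()) for _ in range(1000)]
    anchor_objects = rng.sample(points, 16)  # anchors chosen at random (assumption)
    idx = build_index(points, anchor_objects, math.dist)
    print(knn_search((0.5, 0.5), points, idx, anchor_objects, math.dist, k=3, fraction=0.05))
```

With `fraction=1.0` every element is compared against the query, so the search becomes an exact linear scan. Smaller values trade accuracy for fewer distance computations, which is the probabilistic trade-off the abstract describes.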
Keywords :
database indexing; information retrieval; pattern recognition; search problems; very large databases; large databases; nearest neighbor searching; ordering permutations; pattern recognition tasks; probabilistic proximity search algorithm; proximity retrieval; Computer Society; Databases; Extraterrestrial measurements; Feature extraction; Neural networks; Sequences; Support vector machine classification; Support vector machines; Data Storage Representations; Data Structures; Implementation; Indexing methods; Information Search and Retrieval; Information Storage and Retrieval; Algorithms; Artificial Intelligence; Computer Simulation; Data Interpretation, Statistical; Image Enhancement; Image Interpretation, Computer-Assisted; Models, Statistical; Pattern Recognition, Automated; Subtraction Technique
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
DOI :
10.1109/TPAMI.2007.70815