DocumentCode :
3530876
Title :
Curse of Dimensionality in Pivot Based Indexes
Author :
Volnyansky, Ilya ; Pestov, Vladimir
Author_Institution :
Dept. of Math. & Stat., Univ. of Ottawa, Ottawa, ON, Canada
fYear :
2009
fDate :
29-30 Aug. 2009
Firstpage :
39
Lastpage :
46
Abstract :
We offer a theoretical validation of the curse of dimensionality in the pivot based indexing of datasets for similarity search, by proving, in the framework of statistical learning, that in high dimensions no pivot based indexing scheme can essentially outperform the linear scan. A study of the asymptotic performance of pivot based indexing schemes is performed on a sequence of datasets modeled as samples picked in i.i.d. fashion from a sequence of metric spaces. We allow the size of the dataset to grow in relation to dimension, such that the dimension is superlogarithmic but subpolynomial in the size of the dataset. The number of pivots is sublinear in the size of the dataset. We pick the least restrictive cost model of similarity search where we count each distance calculation as a single computation and disregard the rest. We demonstrate that if the intrinsic dimension of the spaces in the sense of concentration of measure phenomenon is linear in dimension, then the performance of similarity search pivot based indexes is asymptotically linear in the size of the dataset.
Keywords :
data structures; database indexing; statistical analysis; datasets sequence model; dimensionality curse validation; pivot based indexing; similarity search; statistical learning framework; Bismuth; Costs; Databases; Extraterrestrial measurements; Extraterrestrial phenomena; Indexing; Mathematics; Probability; Statistical learning; Statistics; Concentration of Measure; Curse of dimensionality; Data structures; Similarity search;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Similarity Search and Applications, 2009. SISAP '09. Second International Workshop on
Conference_Location :
Prague
Print_ISBN :
978-0-7695-3765-8
Type :
conf
DOI :
10.1109/SISAP.2009.9
Filename :
5271952
Link To Document :
بازگشت