DocumentCode :
787167
Title :
Clustering for approximate similarity search in high-dimensional spaces
Author :
Li, Chen ; Chang, Edward ; Garcia-Molina, Hector ; Wiederhold, Gio
Author_Institution :
Dept. of Comput. Sci., Stanford Univ., CA, USA
Volume :
14
Issue :
4
Year :
2002
Firstpage :
792
Lastpage :
808
Abstract :
We present a clustering and indexing paradigm (called Clindex) for high-dimensional search spaces. The scheme is designed for approximate similarity searches, where one would like to find many of the data points near a target point, but where one can tolerate missing a few near points. For such searches, our scheme can find near points with high recall in very few I/Os and perform significantly better than other approaches. Our scheme is based on finding clusters and then building a simple but efficient index for them. We analyze the trade-offs involved in clustering and building such an index structure, and present extensive experimental results.
Keywords :
computational complexity; database indexing; pattern clustering; query processing; tree data structures; very large databases; visual databases; Clindex; approximate similarity search; clustering; experimental results; high recall; high-dimensional search spaces; image database; indexing; large databases; time complexity; tree-like index structures; Buildings; Clustering algorithms; Content based retrieval; Geometry; Image retrieval; Indexing; Information retrieval; Nearest neighbor searches; Object detection; Search engines
Language :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2002.1019214
Filename :
1019214