DocumentCode
3292541
Title
An Empirical Evaluation of a Distributed Clustering-Based Index for Metric Space Databases
Author
Gil-Costa, Veronica ; Marin, Mauricio ; Reyes, Nora
Author_Institution
Univ. Nac. de San Luis, San Luis
fYear
2008
fDate
11-12 April 2008
Firstpage
95
Lastpage
102
Abstract
Similarity search has proven suitable for searching very large collections of unstructured data objects. We are interested in efficient parallel query processing under continuous streams of queries, as in search engines. A number of sequential index data structures have been proposed for this purpose. This paper focuses on one representative of a class of these data structures, namely one based on clustering, for which we evaluate different ways of distributing the index to support parallelism on a set of processors. Our study reveals that the intuitive choices for both data distribution and model of computing are not efficient in practice. The best results are obtained with a strategy that appears to be more costly in construction, but we show that in practice this cost is not significant.
Keywords
client-server systems; data structures; parallel programming; query processing; data distribution; distributed clustering based index; index data structures; metric space databases; parallel query processing; search engines; similarity search; Data structures; Distributed computing; Distributed databases; Extraterrestrial measurements; Indexes; Nearest neighbor searches; Parallel processing; Query processing; Search engines; Traffic control; BSP; Data Structures; Metric Space; Parallel Search;
fLanguage
English
Publisher
IEEE
Conference_Titel
Similarity Search and Applications, 2008. SISAP 2008. First International Workshop on
Conference_Location
Belfast
Print_ISBN
0-7695-3101-6
Type
conf
DOI
10.1109/SISAP.2008.14
Filename
4492930