Title :
Similarity Searching in Peer-to-Peer Databases
Author :
Bhattacharya, Indrajit ; Kashyap, Srinivas R. ; Parthasarathy, Srinivasan
Author_Institution :
Dept. of Comput. Sci., Maryland Univ., College Park, MD
Abstract :
We consider the problem of handling similarity queries in peer-to-peer databases. We propose an indexing and searching mechanism that, given a query object, returns the set of objects in the database that are semantically related to the query. The indexing scheme clusters data so that semantically related objects are partitioned into a small set of clusters, allowing for a simple and efficient similarity search strategy. It also decouples object and node locations. Our adaptive replication and randomized lookup schemes exploit this feature and ensure that the number of copies of an object is proportional to its popularity and that all replicas are equally likely to serve a given query, thus achieving perfect load balancing. The techniques developed in this work are oblivious to the underlying DHT topology and can be implemented on a variety of structured overlays such as CAN, CHORD, Pastry, and Tapestry. We also present DHT-independent analytical guarantees for the performance of our algorithms in terms of search accuracy, cost, and load balance; the experimental results from our simulations confirm the insights derived from these analytical models.
Keywords :
database indexing; peer-to-peer computing; query formulation; query processing; CAN; CHORD; DHT topology; Pastry; Tapestry; adaptive replication; data clustering; data indexing; load balancing; peer-to-peer databases; query object; randomized lookup; similarity query handling; similarity search strategy; similarity searching; Analytical models; Computer science; Databases; Educational institutions; Indexing; Information retrieval; Load management; Peer to peer computing; Performance analysis; Topology
Conference_Title :
Distributed Computing Systems, 2005. ICDCS 2005. Proceedings. 25th IEEE International Conference on
Conference_Location :
Columbus, OH
Print_ISBN :
0-7695-2331-5
DOI :
10.1109/ICDCS.2005.74