Title of article :
Partial Replica Selection Based on Relevance for Information Retrieval
Author/Authors :
Lu, Zhihong; McKinley, Kathryn S.
Issue Information :
Periodical, serial issue, year 1999
From page :
97
Abstract :
Partial collection replication improves performance and scalability of a large-scale distributed information retrieval system by distributing excessive workloads, reducing network latency, and restricting some searches to a small percentage of data. In this paper, we first examine queries from real system logs and show that there is sufficient query locality in real systems to justify partial collection replication. We then present a method for constructing a hierarchy of partial replicas from a collection where each replica is a subset of all larger replicas, and extend the inference network model to rank and select partial replicas. We compare our new selection algorithm to previous work on collection selection over a range of tuning parameters. For a given query, our replica selection algorithm correctly determines the most relevant of the replicas or original collection, and thus maintains the highest retrieval effectiveness while searching the least data as compared with the other ranking functions. Simulation results show that, with load balancing, partial replication consistently improves performance over collection partitioning on multiple disks of a shared-memory multiprocessor, and that it requires only modest query locality.
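Illustrative sketch (not from the paper): the nested replica hierarchy described in the abstract can be pictured with a few lines of Python. Everything here is an assumption made for illustration: the function names, the toy query log, and in particular the selection rule, which is a simple term-coverage heuristic standing in for the paper's inference network ranking. It only shows the shape of the idea, that is, replicas built from logged results so that each one is a subset of every larger one, with a fallback to the original collection.

```python
from collections import Counter


def build_replica_hierarchy(logged_results, sizes):
    """Build nested replicas from logged query results (hypothetical helper).

    logged_results: list of lists of document ids returned for past queries.
    sizes: replica sizes; they are sorted so the smallest replica comes first.
    """
    counts = Counter(doc for result in logged_results for doc in result)
    # Most frequently retrieved documents first; ties broken by id for determinism.
    ranked = sorted(counts, key=lambda d: (-counts[d], d))
    # Taking prefixes of one ranking guarantees each replica is a subset of larger ones.
    return [set(ranked[:n]) for n in sorted(sizes)]


def select_replica(query_terms, replicas, doc_terms, min_df=1):
    """Pick the smallest replica covering every query term (stand-in heuristic).

    Returns the replica index, or None to mean "search the original collection".
    This is NOT the inference network ranking used in the paper.
    """
    for i, replica in enumerate(replicas):
        if all(
            sum(1 for d in replica if t in doc_terms[d]) >= min_df
            for t in query_terms
        ):
            return i
    return None


# Toy data, invented for this example.
logged_results = [["d1", "d2"], ["d1", "d3"], ["d1", "d2", "d4"]]
doc_terms = {
    "d1": {"replica"},
    "d2": {"replica", "selection"},
    "d3": {"network"},
    "d4": {"latency"},
}
replicas = build_replica_hierarchy(logged_results, sizes=[1, 3])

small = select_replica(["replica"], replicas, doc_terms)
medium = select_replica(["selection"], replicas, doc_terms)
fallback = select_replica(["latency"], replicas, doc_terms)
```

With this toy log, a query on "replica" is served by the smallest replica, a query on "selection" needs the larger one, and a query on "latency" falls back to the original collection because no replica holds a matching document.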
Keywords :
word boundary identification, logistic regression, Chinese text segmentation, multi-word terms
Journal title :
SIGIR FORUM
Serial Year :
1999
Record number :
16805