Title :
Merging Results from Overlapping Databases in Distributed Information Retrieval
Author :
Shengli Wu ; Jieyu Li
Author_Institution :
Sch. of Comput. Sci. & Telecommun. Eng., Jiangsu Univ., Zhenjiang, China
fDate :
Feb. 27, 2013 to March 1, 2013
Abstract :
In this paper, we investigate the problem of results merging in distributed information retrieval when the databases overlap. We focus on two issues: score normalization and weights assignment for each of the component results. An empirical study with TREC data yields three findings: (1) the cubic regression model and the logistic regression model are better than the commonly used zero-one score normalization method; (2) the weighting scheme of uneven similarity is an effective method of weights assignment; (3) score normalization and weights assignment can be used separately or together in a results merging method to improve effectiveness. These findings can help improve effectiveness when implementing a distributed information retrieval system.
Keywords :
distributed databases; information retrieval; logistics; merging; regression analysis; TREC data; cubic regression model; distributed information retrieval system; logistic regression model; overlapping databases; score normalization; weighting scheme; weights assignment; zero-one score normalization method; Mathematical model; Servers; distributed information retrieval; results merging
Conference_Title :
Parallel, Distributed and Network-Based Processing (PDP), 2013 21st Euromicro International Conference on
Conference_Location :
Belfast
Print_ISBN :
978-1-4673-5321-2
Electronic_ISBN :
1066-6192
DOI :
10.1109/PDP.2013.22