DocumentCode
1960044
Title
Merging Results from Overlapping Databases in Distributed Information Retrieval
Author
Shengli Wu ; Jieyu Li
Author_Institution
Sch. of Comput. Sci. & Telecommun. Eng., Jiangsu Univ., Zhenjiang, China
fYear
2013
fDate
Feb. 27 2013-March 1 2013
Firstpage
102
Lastpage
107
Abstract
In this paper, we investigate the problem of results merging in distributed information retrieval when overlapping databases are used. We focus on two issues: score normalization and weights assignment for each of the component results. An empirical study with TREC data yields three findings: (1) the cubic regression model and the logistic regression model are better than the commonly used zero-one score normalization method; (2) the weighting scheme of uneven similarity is an effective method of weights assignment; (3) score normalization and weights assignment can be used separately or together in a results merging method to improve effectiveness. These findings are useful for improving effectiveness when implementing a distributed information retrieval system.
Keywords
distributed databases; information retrieval; logistics; merging; regression analysis; TREC data; cubic regression model; distributed information retrieval system; logistic regression model; overlapping databases; score normalization; weighting scheme; weights assignment; zero-one score normalization method; Distributed databases; Information retrieval; Logistics; Mathematical model; Merging; Servers; distributed information retrieval; overlapping databases; results merging; score normalization; weights assignment;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel, Distributed and Network-Based Processing (PDP), 2013 21st Euromicro International Conference on
Conference_Location
Belfast
ISSN
1066-6192
Print_ISBN
978-1-4673-5321-2
Electronic_ISBN
1066-6192
Type
conf
DOI
10.1109/PDP.2013.22
Filename
6498539
Link To Document