DocumentCode :
3198001
Title :
A Comparative Study on the Combination of Multiple Retrieval Systems
Author :
Chun-Yi Liu ; Chuan-Yi Tang ; Hsu, D. Frank
Author_Institution :
Dept. of Comput. Sci., Nat. Tsing Hua Univ., Hsinchu, Taiwan
fYear :
2012
fDate :
13-15 Dec. 2012
Firstpage :
169
Lastpage :
181
Abstract :
It is known that combining multiple information retrieval systems can, in many cases, yield performance better than that of the individual systems. It is also known that, in these cases, the improvement depends mainly on (a) the performance of each individual system and (b) the diversity between the individual systems. However, quantifying these two conditions remains a challenging problem. In this paper, we investigate these issues using five TREC datasets, TREC 2-6 (1993-97). Six systems are selected from each dataset, either at random or by precision. For each dataset, we then compare the performance of combinations of the six randomly selected systems with that of combinations of the six systems selected by precision. It is demonstrated that, in each of the five datasets, the sum x + y is larger for positive cases (the combination of A and B performs better than or equal to the individual systems) than for negative cases (all other cases), where x is the performance ratio P_l/P_h and y is the diversity between A and B, both normalized to [0, 1]. In addition, it is demonstrated that combinations of t systems, t = 2, 3, 4, 5, and 6, overall perform better on the six systems selected by precision than on the six systems selected at random.
Keywords :
information retrieval; information retrieval system evaluation; random processes; TREC-2 dataset; TREC-3 dataset; TREC-4 dataset; TREC-5 dataset; TREC-6 dataset; combined system performance improvement; diversity normalization; individual systems; multiple information retrieval systems; negative cases; performance ratio normalization; positive cases; precision; random choice; Computers; Data integration; Diversity reception; Educational institutions; Electronic mail; Informatics; cognitive diversity; rank combination; rank-score characteristic (RSC) function; score combination
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pervasive Systems, Algorithms and Networks (ISPAN), 2012 12th International Symposium on
Conference_Location :
San Marcos, TX
ISSN :
1087-4089
Print_ISBN :
978-1-4673-5064-8
Type :
conf
DOI :
10.1109/I-SPAN.2012.31
Filename :
6428822