Title :
Rankboost-Based Result Merging
Author :
Benjamin Ghansah;Shengli Wu;Nathaniel Ghansah
Author_Institution :
Sch. of Comput. Sci. &
Abstract :
The explosion of searchable text content, especially on the web, has left information distributed among many disjoint text information sources, the setting addressed by Federated Search. How to merge the results returned by the selected sources is a major problem in the Federated Search task. We study the problem of learning to rank a set of objects by combining various sources of ranking. The problem of merging search results arises in several domains, for example when combining the results of different verticals and in metasearch applications. This paper presents a supervised learning solution to the result merging problem. Our approach combines multiple sources of evidence to inform the merging decision. We use the Rankboost method, a boosting approach to machine learning, to learn a function that merges results based on information that is readily available in the results pages: the ranks, titles, summaries, URLs and click-through data. We combine these sources of evidence by treating result merging as a multiclass machine learning problem. By not downloading additional information such as the full documents, we reduce processing cost in terms of bandwidth usage and latency. We compare our method against existing result merging methods that rely only on evidence found in ranked lists: Semi-Supervised Learning (SSL), Sample-Agglomerate Fitting Estimate (SAFE) and CORI. An extensive set of experiments demonstrates that our method is more effective than these baseline result merging algorithms under a variety of conditions.
Keywords :
"Merging","Training","Classification algorithms","Training data","Feature extraction","Metasearch"
Conference_Title :
Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing (CIT/IUCC/DASC/PICOM), 2015 IEEE International Conference on
DOI :
10.1109/CIT/IUCC/DASC/PICOM.2015.136