Title of article :
Evaluating subtopic retrieval methods: Clustering versus diversification of search results
Author/Authors :
Claudio Carpineto, Massimiliano D’Amico, Giovanni Romano
Issue Information :
Bimonthly, serial issue, year 2012
Pages :
16
From page :
358
To page :
373
Abstract :
To address the inability of current ranking systems to support subtopic retrieval, two main post-processing techniques of search results have been investigated: clustering and diversification. In this paper we present a comparative study of their performance, using a set of complementary evaluation measures that can be applied to both partitions and ranked lists, and two specialized test collections focusing on broad and ambiguous queries, respectively. The main finding of our experiments is that diversification of top hits is more useful for quick coverage of distinct subtopics whereas clustering is better for full retrieval of single subtopics, with a better balance in performance achieved through generating multiple subsets of diverse search results. We also found that there is little scope for improvement over the search engine baseline unless we are interested in strict full-subtopic retrieval, and that search results clustering methods do not perform well on queries with low divergence subtopics, mainly due to the difficulty of generating discriminative cluster labels.
Keywords :
Diversification, Search results clustering, Search results diversification, Subtopic retrieval evaluation, Subtopic retrieval, Clustering, Search results re-ranking
Journal title :
Information Processing and Management
Serial Year :
2012
Record number :
1229224