• DocumentCode
    3143311
  • Title

    On query result diversification

  • Author

    Vieira, Marcos R. ; Razente, Humberto L. ; Barioni, Maria C N ; Hadjieleftheriou, Marios ; Srivastava, Divesh ; Traina, Caetano, Jr. ; Tsotras, Vassilis J.

  • Author_Institution
    Univ. of California, Riverside, CA, USA
  • fYear
    2011
  • fDate
    11-16 April 2011
  • Firstpage
    1163
  • Lastpage
    1174
  • Abstract
    In this paper we describe a general framework for evaluation and optimization of methods for diversifying query results. In these methods, an initial ranking candidate set produced by a query is used to construct a result set, where elements are ranked with respect to relevance and diversity features, i.e., the retrieved elements should be as relevant as possible to the query, and, at the same time, the result set should be as diverse as possible. While addressing relevance is relatively simple and has been heavily studied, diversity is a harder problem to solve. One major contribution of this paper is that, using the above framework, we adapt, implement and evaluate several existing methods for diversifying query results. We also propose two new approaches, namely the Greedy with Marginal Contribution (GMC) and the Greedy Randomized with Neighborhood Expansion (GNE) methods. Another major contribution of this paper is that we present the first thorough experimental evaluation of the various diversification techniques implemented in a common framework. We examine the methods´ performance with respect to precision, running time and quality of the result. Our experimental results show that while the proposed methods have higher running times, they achieve precision very close to the optimal, while also providing the best result quality. While GMC is deterministic, the randomized approach (GNE) can achieve better result quality if the user is willing to tradeoff running time.
  • Keywords
    query processing; greedy randomized with neighborhood expansion method; greedy with marginal contribution method; query result diversification; query result evaluation; query result optimization; Approximation algorithms; Approximation methods; Clustering algorithms; Dispersion; Force measurement; Optimization; Taxonomy;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Engineering (ICDE), 2011 IEEE 27th International Conference on
  • Conference_Location
    Hannover
  • ISSN
    1063-6382
  • Print_ISBN
    978-1-4244-8959-6
  • Electronic_ISBN
    1063-6382
  • Type

    conf

  • DOI
    10.1109/ICDE.2011.5767846
  • Filename
    5767846