• DocumentCode
    3143528
  • Title

    Interval-based pruning for top-k processing over compressed lists

  • Author

    Chakrabarti, Kaushik ; Chaudhuri, Surajit ; Ganti, Venkatesh

  • fYear
    2011
  • fDate
    11-16 April 2011
  • Firstpage
    709
  • Lastpage
    720
  • Abstract
    Optimizing execution of top-k queries over record-id ordered, compressed lists is challenging. The threshold family of algorithms cannot be effectively used in such cases. Yet, improving execution of such queries is of great value. For example, top-k keyword search in information retrieval (IR) engines represents an important scenario where such optimization can be directly beneficial. In this paper, we develop novel algorithms to improve execution of such queries over state of the art techniques. Our main insights are pruning based on fine-granularity bounds and traversing the lists based on judiciously chosen “intervals” rather than individual records. We formally study the optimality characteristics of the proposed algorithms. Our algorithms require minimal changes and can be easily integrated into IR engines. Our experiments on real-life datasets show that our algorithm outperform the state of the art techniques by a factor of 3-6 in terms of query execution times.
  • Keywords
    query processing; compressed lists; information retrieval engines; interval-based pruning; query execution times; top-k keyword search; top-k processing; top-k queries; Engines; Indexes; Keyword search; Organizations; Partitioning algorithms; Query processing; Upper bound;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Engineering (ICDE), 2011 IEEE 27th International Conference on
  • Conference_Location
    Hannover
  • ISSN
    1063-6382
  • Print_ISBN
    978-1-4244-8959-6
  • Electronic_ISBN
    1063-6382
  • Type

    conf

  • DOI
    10.1109/ICDE.2011.5767855
  • Filename
    5767855