DocumentCode
3143528
Title
Interval-based pruning for top-k processing over compressed lists
Author
Chakrabarti, Kaushik ; Chaudhuri, Surajit ; Ganti, Venkatesh
fYear
2011
fDate
11-16 April 2011
Firstpage
709
Lastpage
720
Abstract
Optimizing execution of top-k queries over record-id ordered, compressed lists is challenging. The threshold family of algorithms cannot be effectively used in such cases. Yet, improving execution of such queries is of great value. For example, top-k keyword search in information retrieval (IR) engines represents an important scenario where such optimization can be directly beneficial. In this paper, we develop novel algorithms to improve execution of such queries over state of the art techniques. Our main insights are pruning based on fine-granularity bounds and traversing the lists based on judiciously chosen “intervals” rather than individual records. We formally study the optimality characteristics of the proposed algorithms. Our algorithms require minimal changes and can be easily integrated into IR engines. Our experiments on real-life datasets show that our algorithm outperform the state of the art techniques by a factor of 3-6 in terms of query execution times.
Keywords
query processing; compressed lists; information retrieval engines; interval-based pruning; query execution times; top-k keyword search; top-k processing; top-k queries; Engines; Indexes; Keyword search; Organizations; Partitioning algorithms; Query processing; Upper bound;
fLanguage
English
Publisher
ieee
Conference_Titel
Data Engineering (ICDE), 2011 IEEE 27th International Conference on
Conference_Location
Hannover
ISSN
1063-6382
Print_ISBN
978-1-4244-8959-6
Electronic_ISBN
1063-6382
Type
conf
DOI
10.1109/ICDE.2011.5767855
Filename
5767855
Link To Document