DocumentCode :
2405033
Title :
A sampling-based estimator for top-k selection query
Author :
Chen, Chung-Min ; Ling, Yibei
Author_Institution :
Telcordia Technol., Morristown, NJ, USA
fYear :
2002
fDate :
2002
Firstpage :
617
Lastpage :
627
Abstract :
Top-k queries arise naturally in many database applications that require searching for records whose attribute values are close to those specified in a query. We study the problem of processing a top-k query by translating it into an approximate range query that can be efficiently processed by traditional relational DBMSs. We propose a sampling-based approach, along with various query mapping strategies, to determine a range query that yields high recall with low access cost. Our experiments on real-world datasets show that, given the same memory budgets, our sampling-based estimator outperforms a previous histogram-based method in terms of access cost, while achieving the same level of recall. Furthermore, unlike the histogram-based approach, our sampling-based query mapping scheme scales well for high dimensional data and is easy to implement with low maintenance cost
Keywords :
database theory; query processing; relational databases; approximate range query; experiments; high recall; histogram-based method; low access cost; query mapping; relational database; sampling-based estimator; searching; top-k selection query; Data engineering;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Engineering, 2002. Proceedings. 18th International Conference on
Conference_Location :
San Jose, CA
ISSN :
1063-6382
Print_ISBN :
0-7695-1531-2
Type :
conf
DOI :
10.1109/ICDE.2002.994779
Filename :
994779
Link To Document :
بازگشت