• DocumentCode
    2405033
  • Title

    A sampling-based estimator for top-k selection query

  • Author

    Chen, Chung-Min ; Ling, Yibei

  • Author_Institution
    Telcordia Technol., Morristown, NJ, USA
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    617
  • Lastpage
    627
  • Abstract
    Top-k queries arise naturally in many database applications that require searching for records whose attribute values are close to those specified in a query. We study the problem of processing a top-k query by translating it into an approximate range query that can be efficiently processed by traditional relational DBMSs. We propose a sampling-based approach, along with various query mapping strategies, to determine a range query that yields high recall with low access cost. Our experiments on real-world datasets show that, given the same memory budgets, our sampling-based estimator outperforms a previous histogram-based method in terms of access cost, while achieving the same level of recall. Furthermore, unlike the histogram-based approach, our sampling-based query mapping scheme scales well for high dimensional data and is easy to implement with low maintenance cost
  • Keywords
    database theory; query processing; relational databases; approximate range query; experiments; high recall; histogram-based method; low access cost; query mapping; relational database; sampling-based estimator; searching; top-k selection query; Data engineering;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Engineering, 2002. Proceedings. 18th International Conference on
  • Conference_Location
    San Jose, CA
  • ISSN
    1063-6382
  • Print_ISBN
    0-7695-1531-2
  • Type

    conf

  • DOI
    10.1109/ICDE.2002.994779
  • Filename
    994779