• DocumentCode
    3013318
  • Title

    Overcoming limitations of sampling for aggregation queries

  • Author

    Chaudhuri, Surajit ; Das, Gautam ; Datar, Mayur ; Motwani, Rajeev ; Narasayya, Vivek

  • Author_Institution
    Microsoft Corp., Redmond, WA, USA
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    534
  • Lastpage
    542
  • Abstract
    Studies the problem of approximately answering aggregation queries using sampling. We observe that uniform sampling performs poorly when the distribution of the aggregated attribute is skewed. To address this issue, we introduce a technique called outlier indexing. Uniform sampling is also ineffective for queries with low selectivity. We rely on weighted sampling based on workload information to overcome this shortcoming. We demonstrate that a combination of outlier indexing with weighted sampling can be used to answer aggregation queries with a significantly reduced approximation error compared to either uniform sampling or weighted sampling alone. We discuss the implementation of these techniques on Microsoft's SQL Server and present experimental results that demonstrate the merits of our techniques.
  • Keywords
    SQL; database indexing; file servers; query processing; relational databases; sampling methods; Microsoft SQL Server; aggregated attribute distribution; aggregation queries; approximate query answering; approximation error; outlier indexing; query selectivity; sampling limitations; skewed distribution; uniform sampling; weighted sampling; workload information; Aggregates; Computer science; Data mining; Query processing; Relational databases; Sampling methods
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2001. Proceedings. 17th International Conference on
  • Conference_Location
    Heidelberg
  • ISSN
    1063-6382
  • Print_ISBN
    0-7695-1001-9
  • Type

    conf

  • DOI
    10.1109/ICDE.2001.914867
  • Filename
    914867