Title :
New Sampling-Based Estimators for OLAP Queries
Author :
Jin, Ruoming ; Glimcher, Leo ; Jermaine, Chris ; Agrawal, Gagan
Author_Institution :
Kent State University
Abstract :
One important way in which sampling for approximate query processing in a database environment differs from traditional applications of sampling is that in a database, it is feasible to collect accurate summary statistics from the data in addition to the sample. This paper describes a set of sampling-based estimators for approximate query processing that make use of simple summary statistics to greatly increase the accuracy of sampling-based estimators. Our estimators are able to give tight probabilistic guarantees on estimation accuracy. They are suitable for low- or high-dimensional data, and work with categorical or numerical attributes. Furthermore, the information used by our estimators can easily be gathered in a single pass, making them suitable for use in a streaming environment.
Keywords :
Aggregates; Data analysis; Hardware; Histograms; Image databases; Query processing; Relational databases; Sampling methods; Sociology; Statistics
Conference_Title :
Data Engineering, 2006. ICDE '06. Proceedings of the 22nd International Conference on
Print_ISBN :
0-7695-2570-9
DOI :
10.1109/ICDE.2006.106