DocumentCode :
2219800
Title :
Partition histograms based on moving averages
Author :
Wei, Zhang ; Zhang Wei
Author_Institution :
Sch. of Comput. Sci., Huazhong Univ. of Sci. & Technol., Wuhan, China
Volume :
6
fYear :
2010
fDate :
20-22 Aug. 2010
Abstract :
Data streams are data in infinite, continuous, arriving at fast rate and in large amount. Histograms have been widely used to capture attribute value distribution statistics for query optimizers. Recently, histograms have also been considered as a way to produce quick approximate answers to decision support queries. We are interested in an efficient algorithm for choosing the bucket boundaries in a way that either minimizes the estimation error for a given amount of space (number of buckets), or, minimizes the space needed for a given upper bound on the error. In this paper, we present algorithms for computing optimal bucket boundaries is optimal than dynamic programming algorithm. We introduce the concept of moving average to compute SSE (Sum Squared Error). This algorithm can help to split the bucket, which result in more quickly generating the histogram and less storage. Through experiments, we show that the MPA algorithm is better than DPA algorithm.
Keywords :
data handling; decision support systems; dynamic programming; query processing; attribute value distribution statistics; data streams; decision support queries; dynamic programming algorithm; moving averages; optimal bucket boundaries; partition histograms; query optimizers; sum squared error; SSE; bucket; data stream; histograms; moving average;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Advanced Computer Theory and Engineering (ICACTE), 2010 3rd International Conference on
Conference_Location :
Chengdu
ISSN :
2154-7491
Print_ISBN :
978-1-4244-6539-2
Type :
conf
DOI :
10.1109/ICACTE.2010.5579202
Filename :
5579202
Link To Document :
بازگشت