DocumentCode :
4693
Title :
FastRAQ: A Fast Approach to Range-Aggregate Queries in Big Data Environments
Author :
Xiaochun Yun ; Guangjun Wu ; Guangyan Zhang ; Keqin Li ; Shupeng Wang
Author_Institution :
Inst. of Inf. Eng., Beijing, China
Volume :
3
Issue :
2
fYear :
2015
fDate :
April-June 2015
Firstpage :
206
Lastpage :
218
Abstract :
Range-aggregate queries apply a given aggregate function to all tuples within specified query ranges. Existing approaches to range-aggregate queries cannot provide accurate results quickly enough in big data environments. In this paper, we propose FastRAQ, a fast approach to range-aggregate queries in big data environments. FastRAQ first divides big data into independent partitions with a balanced partitioning algorithm, and then generates a local estimation sketch for each partition. When a range-aggregate query request arrives, FastRAQ obtains the result directly by summarizing local estimates from all partitions. FastRAQ has O(1) time complexity for data updates and O(N/(P×B)) time complexity for range-aggregate queries, where N is the number of distinct tuples for all dimensions, P is the number of partitions, and B is the number of buckets in the histogram. We implement the FastRAQ approach on the Linux platform and evaluate its performance with about 10 billion data records. Experimental results demonstrate that FastRAQ provides range-aggregate query results within a time period two orders of magnitude lower than that of Hive, while the relative error is less than 3 percent within the given confidence interval.
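The partition-then-summarize idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the balanced partitioning algorithm or the form of the local estimation sketch, so round-robin assignment and a one-dimensional equi-width histogram are used here as stand-ins, and all names (PartitionSketch, build_sketches, range_count) are hypothetical. It only shows how an approximate COUNT over a range could be obtained by summing per-partition estimates, with constant-time updates.

```python
class PartitionSketch:
    """Equi-width histogram over [lo, hi) for one partition (illustrative stand-in)."""

    def __init__(self, lo, hi, num_buckets):
        self.lo = lo
        self.hi = hi
        self.num_buckets = num_buckets
        self.width = (hi - lo) / num_buckets
        self.counts = [0] * num_buckets

    def update(self, value):
        # O(1) update: find the bucket by arithmetic and bump its counter.
        # Assumes lo <= value <= hi; the top edge is clamped into the last bucket.
        idx = min(int((value - self.lo) / self.width), self.num_buckets - 1)
        self.counts[idx] += 1

    def estimate_count(self, q_lo, q_hi):
        # Estimate how many values fall in [q_lo, q_hi), assuming values are
        # spread uniformly inside each bucket.
        total = 0.0
        for i, count in enumerate(self.counts):
            b_lo = self.lo + i * self.width
            b_hi = b_lo + self.width
            overlap = max(0.0, min(q_hi, b_hi) - max(q_lo, b_lo))
            total += count * overlap / self.width
        return total


def build_sketches(values, num_partitions, num_buckets, lo, hi):
    # Round-robin assignment is an assumption standing in for the paper's
    # balanced partitioning algorithm.
    sketches = [PartitionSketch(lo, hi, num_buckets) for _ in range(num_partitions)]
    for i, value in enumerate(values):
        sketches[i % num_partitions].update(value)
    return sketches


def range_count(sketches, q_lo, q_hi):
    # The global answer is the sum of the local estimates from all partitions.
    return sum(s.estimate_count(q_lo, q_hi) for s in sketches)


if __name__ == "__main__":
    sketches = build_sketches(range(100), num_partitions=4, num_buckets=10, lo=0, hi=100)
    print(range_count(sketches, 20, 50))
```

In a distributed setting each partition's estimate would be computed on its own server and only the small per-partition results would be combined, which is the property this sketch is meant to convey.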
Keywords :
Big Data; Linux; computational complexity; query processing; FastRAQ; Linux platform; balanced partitioning algorithm; big data environments; range-aggregate queries; time complexity; Aggregates; Big data; Histograms; Indexes; Partitioning algorithms; Servers; Vectors; Balanced partition; big data; multidimensional histogram; range-aggregate query
fLanguage :
English
Journal_Title :
Cloud Computing, IEEE Transactions on
Publisher :
IEEE
ISSN :
2168-7161
Type :
jour
DOI :
10.1109/TCC.2014.2338325
Filename :
6868272