Title :
Parallel Aggregation Queries over Star Schema: A Hierarchical Encoding Scheme and Efficient Percentile Computing as a Case
Author :
Qin, Xiongpai ; Wang, Huiju ; Du, Xiaoyong ; Wang, Shan
Author_Institution :
Sch. of Inf., Renmin Univ. of China, Beijing, China
Abstract :
Big data analysis has recently become a major challenge. Cloud computing is attracting more and more big data analysis applications because of its good scalability and fault tolerance. Some aggregation functions, such as SUM, can be computed in parallel because they satisfy the distributive law of addition, so partial results computed on separate nodes can be combined into the final result. Unfortunately, some statistical functions are not naturally parallelizable, meaning that they do not satisfy the distributive law of addition. In this paper, we focus on the percentile computation problem. We propose an iterative, prediction-based parallel algorithm for distributed systems. Prediction is done through a sampling technique. Experimental results verify the efficiency of our algorithm.
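The contrast drawn in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which this record does not describe in detail. It assumes a generic scheme in which each partition stands for one node, a pooled random sample predicts the percentile, and exact per-node counts confirm or narrow the guess. The names `distributed_sum`, `parallel_percentile`, and `sample_per_node` are hypothetical, and the nearest-rank percentile definition is an assumption.

```python
import math
import random


def distributed_sum(partitions):
    # SUM is distributive: each node sums its own partition and the
    # coordinator simply adds the partial sums.
    return sum(sum(p) for p in partitions)


def parallel_percentile(partitions, q, sample_per_node=32, seed=0):
    """Exact q-th percentile (nearest-rank definition, an assumption here).

    Illustrative only: each list in `partitions` stands for one node.
    """
    rng = random.Random(seed)
    n = sum(len(p) for p in partitions)
    k = max(1, math.ceil(q * n / 100))  # 1-based global rank to find
    lo, hi = -math.inf, math.inf        # answer lies strictly inside (lo, hi)
    below = 0                           # number of values <= lo
    while True:
        # Each node samples from its values that are still candidates.
        sample, remaining = [], 0
        for p in partitions:
            cand = [x for x in p if lo < x < hi]
            remaining += len(cand)
            sample.extend(rng.sample(cand, min(sample_per_node, len(cand))))
        # Prediction step: take the sample value at the same relative rank.
        sample.sort()
        frac = (k - below) / remaining
        idx = min(len(sample) - 1, max(0, math.ceil(frac * len(sample)) - 1))
        guess = sample[idx]
        # Verification step: COUNT is distributive, so per-node counts add up.
        less = sum(sum(1 for x in p if x < guess) for p in partitions)
        equal = sum(sum(1 for x in p if x == guess) for p in partitions)
        if less < k <= less + equal:
            return guess                # the guess has exactly rank k
        if k <= less:
            hi = guess                  # true answer is smaller
        else:
            lo, below = guess, less + equal  # true answer is larger
```

Calling `parallel_percentile(partitions, 90)` on a list of per-node lists returns the same value as sorting all the data centrally and taking the element at the nearest rank. Only samples and counts cross node boundaries in each round, which is the motivation for a predict-then-verify iteration.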
Keywords :
cloud computing; data analysis; fault tolerant computing; parallel algorithms; query processing; sampling methods; big data analysis; distributed system; efficient percentile computing; fault-tolerance; hierarchical encoding scheme; iterative-style prediction-based parallel algorithm; parallel aggregation queries; sampling technique; star schema; statistical functions; Algorithm design and analysis; Convergence; Encoding; Histograms; Indexes; Prediction algorithms; Query processing; Hierarchical Encoding; Iterative; Percentile
Conference_Title :
Parallel and Distributed Processing with Applications (ISPA), 2011 IEEE 9th International Symposium on
Conference_Location :
Busan
Print_ISBN :
978-1-4577-0391-1
Electronic_ISBN :
978-0-7695-4428-1
DOI :
10.1109/ISPA.2011.34