DocumentCode :
2492407
Title :
ParaCube: A Scalable OLAP Model Based on Distributed Aggregate Computing with Sibling Cubes
Author :
Zhang, Yansong ; Wang, Shan ; Huang, Wei
Author_Institution :
Key Lab. of the Minist. of Educ. for Data Eng. & Knowledge Eng., Beijing, China
fYear :
2010
fDate :
6-8 April 2010
Firstpage :
323
Lastpage :
329
Abstract :
The demands on OLAP applications are growing rapidly as data volume, user counts, query volume and query complexity increase. The need for shorter update periods in the data warehouse is another crucial factor for a scalable OLAP application. In this paper, we propose a scalable OLAP prototype that supports query processing over growing data volumes by distributing the fact tuples across multiple servers, each of which builds a sibling cube; the sibling cubes can be merged to obtain the whole cube. We employ a lightweight distribution policy that fully duplicates the dimension tables on each sibling server, based on the observation that dimension tables account for a very low proportion of the space cost. An OLAP query with distributed aggregate functions can be transformed into queries that run in parallel on the sibling servers. For non-distributed aggregate functions, such as median, we propose an optimized median aggregate computing algorithm that reduces the transmission volume between servers while computing the global median values. We also present a three-level data warehouse framework to meet the requirement of shorter update periods in "operational business intelligence". An asynchronous tunnel model is proposed to reduce update latency by pre-fetching updated tuples to the OLAP processing server. Finally, we build a prototype system, ParaCube, and evaluate its performance on shared-nothing (SN) and multi-core platforms.
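The idea of merging per-server partial aggregates can be illustrated with a minimal sketch. It is not the ParaCube implementation: the partition data, group keys and function names below are hypothetical, and it covers only aggregates that can be merged from partial results (SUM, COUNT, and AVG derived from them). A median cannot be combined this way, which is why the abstract treats it separately; the paper's optimized median algorithm is not reproduced here.

```python
from collections import defaultdict

# Hypothetical fact tuples (group key, measure); each inner list stands for
# the partition held by one sibling server.
partitions = [
    [("east", 10), ("west", 5)],
    [("east", 7), ("west", 3), ("east", 1)],
    [("west", 4)],
]

def local_aggregate(rows):
    """Run on one sibling server: partial SUM and COUNT per group."""
    partial = defaultdict(lambda: [0, 0])
    for key, value in rows:
        partial[key][0] += value  # running sum
        partial[key][1] += 1      # running count
    return partial

def merge(partials):
    """Run on the coordinator: combine the partial results into global ones."""
    total = defaultdict(lambda: [0, 0])
    for partial in partials:
        for key, (s, c) in partial.items():
            total[key][0] += s
            total[key][1] += c
    # AVG is derived from the merged SUM and COUNT, never from local averages.
    return {k: {"sum": s, "count": c, "avg": s / c} for k, (s, c) in total.items()}

result = merge(local_aggregate(p) for p in partitions)
```

Only one small record per group travels from each server to the coordinator, regardless of how many fact tuples the server holds.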
Keywords :
data mining; data warehouses; distributed processing; query processing; ParaCube; asynchronous tunnel model; data volume; data warehouse; distributed aggregate computing; multicore platforms; nondistributed computing aggregate functions; operational business intelligence; optimized median aggregate computing algorithm; query complexity; query processing; query volume; scalable OLAP model; shared-nothing system; sibling cubes; Acceleration; Aggregates; Application software; Concurrent computing; Data warehouses; Distributed computing; Material storage; Merging; Prototypes; Query processing; ParaCube; distributed aggregate; median; sibling cube;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2010 12th International Asia-Pacific Web Conference (APWEB)
Conference_Location :
Busan
Print_ISBN :
978-0-7695-4012-2
Electronic_ISBN :
978-1-4244-6600-9
Type :
conf
DOI :
10.1109/APWeb.2010.31
Filename :
5474121