DocumentCode :
3506793
Title :
A MapReduceMerge-based Data Cube Construction Method
Author :
Wang, Yuxiang ; Song, Aibo ; Luo, Junzhou
Author_Institution :
Sch. of Comput. Sci. & Eng., Southeast Univ., Nanjing, China
fYear :
2010
fDate :
1-5 Nov. 2010
Firstpage :
1
Lastpage :
6
Abstract :
Pre-computing data cubes is critical to improving the response time of On-Line Analytical Processing (OLAP) systems. However, as the size of the data grows, the time it takes to construct data cubes becomes a significant performance bottleneck, so a parallel pre-computation approach is needed to further improve OLAP performance. Current parallel approaches fall into two categories: work partitioning and data partitioning. The first cannot guarantee load balance among processors, and the second produces massive data movement between processors. This paper proposes a MapReduceMerge-based parallel data cube construction method with a read-optimized data storage strategy that is better suited to OLAP. Compared with traditional approaches, our method ensures good load balancing and reduces the amount of data movement. MapReduceMerge is an extension of MapReduce, a programming model that enables easy development of parallel applications that process massive data on large clusters. MapReduce is the key element of Hadoop (a cloud computing framework), which is used to support Facebook's business in a cloud environment. We modify the original MapReduceMerge framework to meet the needs of cuboid construction and show the implementation in detail through an example of 2-dimension cuboid construction. We also discuss optimizations for the construction of multi-dimension cuboids.
Keywords :
cloud computing; data mining; data warehouses; parallel processing; resource allocation; social networking (online); 2-dimension cuboids construction; Facebook; Hadoop; MapReduceMerge framework; OLAP; cloud computing framework; data partitioning; data warehouse; load balancing; multidimension cuboid construction; on-line analytical processing system; parallel application; parallel data cube construction method; parallel pre-computation approach; performance bottleneck; processor; programming model; read-optimized data storage strategy; work partitioning; MapReduceMerge; OLAP; data cube;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Grid and Cooperative Computing (GCC), 2010 9th International Conference on
Conference_Location :
Nanjing
Print_ISBN :
978-1-4244-9334-0
Electronic_ISBN :
978-0-7695-4313-0
Type :
conf
DOI :
10.1109/GCC.2010.14
Filename :
5662731