DocumentCode
704165
Title
Optimizing OLAP Cubes Construction by Improving Data Placement on Multi-nodes Clusters
Author
Arres, Billel ; Kabachi, Nadia ; Boussaid, Omar
Author_Institution
Univ. Lumiere Lyon 2, Bron, France
fYear
2015
fDate
4-6 March 2015
Firstpage
520
Lastpage
524
Abstract
The increasing volume of relational data calls for alternative ways to cope with it. The Hadoop framework, an open source project based on the MapReduce paradigm, is a popular choice for big data analytics. However, the performance gained from Hadoop's features is currently limited by its default block placement policy, which does not take any data characteristics into account. Indeed, the efficiency of many operations, including indexing, grouping, aggregation and joins, can be improved by careful data placement. In this paper, we propose a data warehouse placement policy to improve query performance on multi-node clusters, especially Hadoop clusters. We investigate the performance gain for an OLAP cube construction query with and without data organization, varying the number of nodes and the data warehouse size. The results show that the proposed data placement policy lowers the global execution time for building OLAP data cubes by up to 20 percent compared to the default data placement.
Keywords
Big Data; data mining; data warehouses; parallel processing; public domain software; query processing; relational databases; Hadoop framework; MapReduce paradigm; OLAP cube construction query; OLAP cubes construction optimization; aggregation; big data analytics; data organization; data placement improvement; data warehouse placement policy; default block placement policy; grouping; indexing; joins; multinodes clusters; open source project; query gain performance improvement; relational data; Benchmark testing; Context; Data warehouses; Distributed databases; Indexing; Organizations; Warehousing; Block Placement; Data warehouses; HDFS; MapReduce;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel, Distributed and Network-Based Processing (PDP), 2015 23rd Euromicro International Conference on
Conference_Location
Turku
ISSN
1066-6192
Type
conf
DOI
10.1109/PDP.2015.45
Filename
7092769