Title :
Optimizing Multi-join in Cloud Environment
Author :
Yongcai Tao ; Mengxue Zhou ; Lei Shi ; Lin Wei ; Yangjie Cao
Author_Institution :
Sch. of Inf. Eng., Zhengzhou Univ., Zhengzhou, China
Abstract :
In cloud computing, complex data analysis usually requires access to multiple data sets. Existing MapReduce-based multi-join mechanisms join multiple data sets in a cascade, which is flexible but inefficient. The paper analyzes existing concurrent join models and proposes a Two-Dimension Reducer matrix based Hierarchized Multi-Join model (TD-HMJ). TD-HMJ handles all the "key" attributes in one Map phase and divides the tables to be joined into several groups of two or three tables each. In the Reduce phase, the tables in each group are joined simultaneously by establishing a two-dimension Reducer matrix. TD-HMJ then completes the joins between groups through multiple Reduce processes. Theoretical analysis and experimental results show that TD-HMJ reduces data transmission, shortens multi-join time, and improves system efficiency.
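The abstract does not give the details of the grouping step or of the Reducer matrix, so the following is only a minimal single-process sketch under stated assumptions, not the authors' implementation. It assumes a chain join R(a, b), S(b, c), T(c, d) and uses the well-known replicated three-way join layout: each S tuple goes to exactly one cell of a rows x cols reducer grid, while R tuples are replicated along a row and T tuples along a column. The names `group_tables` and `matrix_join` and the grid-size parameters are hypothetical.

```python
from collections import defaultdict
from itertools import product


def group_tables(names):
    """Split table names into consecutive groups of three or two (never one).

    Assumption: consecutive tables share a join attribute (a chain join).
    """
    n = len(names)
    if n < 2:
        raise ValueError("need at least two tables to join")
    twos = {0: 0, 1: 2, 2: 1}[n % 3]      # how many pairs are needed
    threes = (n - 2 * twos) // 3          # the rest are triples
    groups, start = [], 0
    for size in [3] * threes + [2] * twos:
        groups.append(names[start:start + size])
        start += size
    return groups


def matrix_join(R, S, T, rows=2, cols=2):
    """Join R(a,b), S(b,c), T(c,d) in one simulated Reduce pass.

    Reducers are modelled as cells (i, j) of a rows x cols grid.
    """
    cells = defaultdict(lambda: {"R": [], "S": [], "T": []})

    # "Map" side: decide which reducer cells receive each tuple.
    for a, b in R:                        # replicate R along row hash(b)
        for j in range(cols):
            cells[(hash(b) % rows, j)]["R"].append((a, b))
    for b, c in S:                        # S lands in exactly one cell
        cells[(hash(b) % rows, hash(c) % cols)]["S"].append((b, c))
    for c, d in T:                        # replicate T along column hash(c)
        for i in range(rows):
            cells[(i, hash(c) % cols)]["T"].append((c, d))

    # "Reduce" side: each cell joins its local fragments independently.
    out = []
    for cell in cells.values():
        for (a, b), (b2, c), (c2, d) in product(cell["R"], cell["S"], cell["T"]):
            if b == b2 and c == c2:
                out.append((a, b, c, d))
    return sorted(out)
```

Because every S tuple is stored in a single cell, each joined row is produced exactly once, and no intermediate two-table result has to be written out and re-read as it would be in a cascade of two-way joins. The cost is the replication of R and T across the grid, which is why the grid dimensions matter in practice.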
Keywords :
cloud computing; concurrency control; data analysis; optimisation; parallel processing; MapReduce-based multijoin mechanism; TD-HMJ; cloud environment; complex data analysis; concurrent join models; data transmission; multijoin optimization; two-dimension reducer matrix based hierarchized multijoin model; Analytical models; Computational modeling; Data communication; Data models; Data processing; Educational institutions; hadoop; mapreduce; multi-join
Conference_Title :
2013 IEEE 10th International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC)
Conference_Location :
Zhangjiajie
DOI :
10.1109/HPCC.and.EUC.2013.136