DocumentCode
688245
Title
Optimizing Multi-join in Cloud Environment
Author
Yongcai Tao ; Mengxue Zhou ; Lei Shi ; Lin Wei ; Yangjie Cao
Author_Institution
Sch. of Inf. Eng., Zhengzhou Univ., Zhengzhou, China
fYear
2013
fDate
13-15 Nov. 2013
Firstpage
956
Lastpage
963
Abstract
In cloud computing, complex data analysis usually requires access to multiple data sets. Existing MapReduce-based multi-join mechanisms join multiple data sets by a cascade method, which is flexible but inefficient. The paper analyzes existing concurrent join models and proposes a Two-Dimension Reducer matrix based Hierarchized Multi-Join model (TD-HMJ). TD-HMJ handles all the "key" attributes in one Map phase and divides the tables to be joined into several groups of two or three tables each. In the Reduce phase, the tables in each group are joined at the same time by establishing a two-dimension Reducer matrix. TD-HMJ then completes the joins between groups through multiple Reduce processes. Theoretical analysis and experimental results show that TD-HMJ reduces data transmission, shortens multi-join time, and increases system efficiency.
Keywords
cloud computing; concurrency control; data analysis; optimisation; parallel processing; MapReduce-based multijoin mechanism; TD-HMJ; cloud computing; cloud environment; complex data analysis; concurrent join models; data transmission; multijoin optimization; two-dimension reducer matrix based hierarchized multijoin model; Analytical models; Cloud computing; Computational modeling; Data communication; Data models; Data processing; Educational institutions; cloud computing; data processing; hadoop; mapreduce; multi-join;
fLanguage
English
Publisher
IEEE
Conference_Titel
High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on
Conference_Location
Zhangjiajie
Type
conf
DOI
10.1109/HPCC.and.EUC.2013.136
Filename
6832018