Title :
MARLA: MapReduce for Heterogeneous Clusters
Author :
Fadika, Zacharia ; Dede, Elif ; Hartog, Jessica ; Govindaraju, Madhusudhan
Author_Institution :
Comput. Sci. Dept., Binghamton Univ., Binghamton, NY, USA
Abstract :
MapReduce has gradually become the framework of choice for "big data". The MapReduce model allows for efficient and swift processing of large-scale data with a cluster of compute nodes. However, this efficiency comes at a price: the performance of widely used MapReduce implementations such as Hadoop suffers in heterogeneous and load-imbalanced clusters. In this paper, we show that the performance disparity between homogeneous and heterogeneous clusters is high. We then present MARLA, a MapReduce framework capable of performing well not only in homogeneous settings but also when the cluster exhibits heterogeneous properties. We address the problems that existing MapReduce implementations encounter under cluster heterogeneity, and present through MARLA the components and trade-offs necessary for better MapReduce performance in heterogeneous cluster and cloud environments. We quantify the performance gains of our approach against Apache Hadoop and MARIANE in data-intensive and compute-intensive applications.
Keywords :
cloud computing; pattern clustering; Apache Hadoop; MARIANE; MARLA; MapReduce model; cloud environments; cluster heterogeneity; compute intensive applications; compute node cluster; data intensive applications; heterogeneous clusters; homogeneous clusters; large scale data processing; load-imbalanced clusters; Educational institutions; Fault tolerance; Fault tolerant systems; Load modeling; Random access memory; Runtime; HADOOP; MapReduce
Conference_Title :
Cluster, Cloud and Grid Computing (CCGrid), 2012 12th IEEE/ACM International Symposium on
Conference_Location :
Ottawa, ON
Print_ISBN :
978-1-4673-1395-7
DOI :
10.1109/CCGrid.2012.135