Title :
Modeling the Performance of the Hadoop Online Prototype
Author :
Vianna, Emanuel ; Comarela, Giovanni ; Pontes, Tatiana ; Almeida, Jussara ; Almeida, Virgílio ; Wilkinson, Kevin ; Kuno, Harumi ; Dayal, Umeshwar
Author_Institution :
Dept. of Comput. Sci., Fed. Univ. of Minas Gerais (UFMG), Brazil
Abstract :
MapReduce is an important paradigm for supporting modern data-intensive applications. In this paper we address the challenge of modeling the performance of one MapReduce implementation, the Hadoop Online Prototype (HOP), focusing specifically on intra-job pipeline parallelism. We use a hierarchical model that combines a precedence model and a queuing network model to capture the intra-job synchronization constraints. We first show how to build a precedence graph that represents the dependencies among multiple tasks of the same job. We then apply it jointly with an approximate Mean Value Analysis (aMVA) solution to predict mean job response time and resource utilization. We validate our solution against a queuing network simulator in various scenarios and find that our performance model agrees closely with the simulator, with a maximum relative difference under 15%.
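For readers unfamiliar with approximate MVA, the following is a minimal sketch of one common variant, the Schweitzer fixed-point approximation, for a closed single-class queuing network. It is not the authors' hierarchical model: it omits the precedence graph entirely, and the function name, parameters, and service demands are hypothetical, chosen only for illustration.

```python
def schweitzer_amva(demands, n_jobs, think_time=0.0, tol=1e-8, max_iter=10000):
    """Schweitzer approximate MVA for a closed, single-class network.

    demands    -- service demand (seconds) at each queuing station (hypothetical values)
    n_jobs     -- number of jobs circulating in the network
    think_time -- optional delay outside the stations
    Returns (mean response time, throughput, per-station utilizations).
    """
    k = len(demands)
    # Initial guess: jobs spread evenly across the stations.
    queue = [n_jobs / k] * k
    throughput = 0.0
    response = 0.0
    for _ in range(max_iter):
        # Approximation: an arriving job sees (N-1)/N of the mean queue length.
        scale = (n_jobs - 1) / n_jobs
        resid = [d * (1.0 + scale * q) for d, q in zip(demands, queue)]
        response = sum(resid)
        # Little's law applied to the whole network.
        throughput = n_jobs / (think_time + response)
        # Little's law applied to each station.
        new_queue = [throughput * r for r in resid]
        converged = max(abs(a - b) for a, b in zip(new_queue, queue)) < tol
        queue = new_queue
        if converged:
            break
    # Utilization law: U_i = X * D_i.
    utilization = [throughput * d for d in demands]
    return response, throughput, utilization


if __name__ == "__main__":
    # Two stations (say, CPU and disk) with made-up demands, 10 concurrent jobs.
    r, x, u = schweitzer_amva([0.2, 0.1], n_jobs=10)
    print(f"response={r:.4f}s throughput={x:.4f}/s utilization={u}")
```

In a hierarchical approach such as the one the abstract describes, a solver of this kind would presumably be invoked repeatedly, with the precedence model determining which tasks are active concurrently; that coupling is not shown here.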
Keywords :
approximation theory; network theory (graphs); parallel processing; queueing theory; resource allocation; synchronisation; Hadoop online prototype; MapReduce; approximate mean value analysis; data-intensive application; intra-job pipeline parallelism; intra-job synchronization constraint; job response time; precedence graph; precedence model; queuing network model; resource utilization; Analytical models; Computational modeling; Delay; Parallel processing; Pipelines; Synchronization; Time factors; analytical model; hadoop online prototype; pipeline parallelism; queuing network; simulation; task graph
Conference_Title :
Computer Architecture and High Performance Computing (SBAC-PAD), 2011 23rd International Symposium on
Conference_Location :
Vitoria, Espirito Santo
Print_ISBN :
978-1-4577-2050-5
DOI :
10.1109/SBAC-PAD.2011.24