DocumentCode
720560
Title
FlexiMod: Flexible Coexistence Support for Programming Models
Author
Xu, Luna ; Butt, Ali R.
fYear
2015
fDate
4-7 May 2015
Firstpage
773
Lastpage
776
Abstract
The rapid growth in big data is driving the development and evolution of numerous analytics frameworks optimized for the different needs of applications. Emerging big data applications comprise rich multi-faceted workflows with both compute-intensive and data-intensive tasks with intricate communication patterns. Thus, a single framework cannot support all application types and needs. For example, while the MapReduce model has proven to be effective for common data-intensive tasks with well-defined execution phases, the MPI programming model may be better suited for extracting high performance for compute-intensive tasks and handling arbitrary communication patterns. Researchers have recognized this need to employ specialized models for different phases of a workflow, e.g., performing computations using MPI followed by visualizations using MapReduce. As a result, compromises have to be made either to use multi-cluster approaches that entail large data movement across clusters, or to sacrifice some aspects of the applications, e.g., using MapReduce alone at the cost of a higher communication overhead. Consequently, there is a crucial need for supporting coexisting disparate programming models on the same set of resources that are managed in a holistic manner. The objective of this research is to provide an efficient solution for the above problem by designing FLEXIMOD, a holistic approach for supporting coexistence of multiple programming models. The envisioned solution includes a user-friendly workflow generation tool, a runtime environment that feeds the different tasks to different programming frameworks and transparently executes the workflow, and an underlying scheduling system that coordinates and co-hosts different frameworks on the same set of resources in a multi-tenant environment. Our pilot project, GERBIL, is a framework for co-hosting unmodified MPI applications alongside MapReduce applications on top of YARN. GERBIL bridges the fundamental mismatch between YARN and MPI by designing an MPI-aware resource allocation mechanism. Our initial evaluation shows that GERBIL enables MPI executions with performance comparable to a native MPI setup, and improves compute-intensive application performance by up to 133% when compared to the corresponding MapReduce versions of the applications.
Keywords
Big Data; data analysis; message passing; pattern clustering; resource allocation; FlexiMod; GERBIL; MPI programming model; MPI-aware resource allocation mechanism; MapReduce applications; MapReduce model; YARN; analytics frameworks; arbitrary communication patterns; big data applications; compute-intensive application performance; compute-intensive tasks; data movement; data-intensive tasks; disparate programming models; execution phases; flexible coexistence support; intricate communication patterns; multicluster approaches; programming frameworks; programming models; runtime environment; unmodified MPI applications; user-friendly workflow generation tool; Computational modeling; Containers; Fault tolerance; Fault tolerant systems; Programming; Resource management; Yarn;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on
Conference_Location
Shenzhen
Type
conf
DOI
10.1109/CCGrid.2015.173
Filename
7152554