DocumentCode :
720560
Title :
FlexiMod: Flexible Coexistence Support for Programming Models
Author :
Xu, Luna; Butt, Ali R.
fYear :
2015
fDate :
4-7 May 2015
Firstpage :
773
Lastpage :
776
Abstract :
The rapid growth in big data is driving the development and evolution of numerous analytics frameworks optimized for the different needs of applications. Emerging big data applications comprise rich, multi-faceted workflows with both compute-intensive and data-intensive tasks and intricate communication patterns. Thus, a single framework cannot support all application types and needs. For example, while the MapReduce model has proven effective for common data-intensive tasks with well-defined execution phases, the MPI programming model may be better suited for extracting high performance from compute-intensive tasks and handling arbitrary communication patterns. Researchers have recognized this need to employ specialized models for different phases of a workflow, e.g., performing computations using MPI followed by visualizations using MapReduce. As a result, compromises have to be made: either use multi-cluster approaches that entail large data movement across clusters, or sacrifice some aspects of the applications, e.g., use MapReduce alone at the cost of higher communication overhead. Consequently, there is a crucial need to support coexisting disparate programming models on the same set of resources, managed in a holistic manner. The objective of this research is to provide an efficient solution to this problem by designing FLEXIMOD, a holistic approach for supporting the coexistence of multiple programming models. The envisioned solution includes a user-friendly workflow generation tool, a runtime environment that feeds the different tasks to different programming frameworks and transparently executes the workflow, and an underlying scheduling system that coordinates and co-hosts different frameworks on the same set of resources in a multi-tenant environment. Our pilot project, GERBIL, is a framework for co-hosting unmodified MPI applications alongside MapReduce applications on top of YARN. GERBIL bridges the fundamental mismatch between YARN and MPI by designing an MPI-aware resource allocation mechanism. Our initial evaluation shows that GERBIL enables MPI executions with performance comparable to a native MPI setup, and improves compute-intensive application performance by up to 133% when compared to the corresponding MapReduce versions of the applications.
Keywords :
Big Data; data analysis; message passing; pattern clustering; resource allocation; FlexiMod; GERBIL; MPI programming model; MPI-aware resource allocation mechanism; MapReduce applications; MapReduce model; YARN; analytics frameworks; arbitrary communication patterns; big data applications; compute-intensive application performance; compute-intensive tasks; data movement; data-intensive tasks; disparate programming models; execution phases; flexible coexistence support; intricate communication patterns; multicluster approaches; programming frameworks; programming models; runtime environment; unmodified MPI applications; user-friendly workflow generation tool; Computational modeling; Containers; Fault tolerance; Fault tolerant systems; Programming; Resource management; Yarn;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on
Conference_Location :
Shenzhen
Type :
conf
DOI :
10.1109/CCGrid.2015.173
Filename :
7152554