DocumentCode
3360589
Title
A scheduling mechanism for multiple MapReduce jobs in a workflow application (position paper)
Author
Yoo, Dongjin ; Sim, Kwang Mong
Author_Institution
Multi-Agent & Cloud Comput. Syst. Lab., Gwangju Inst. of Sci. & Technol. (GIST), Gwangju, South Korea
fYear
2012
fDate
11-13 Jan. 2012
Firstpage
405
Lastpage
410
Abstract
MapReduce is an attractive model for data-intensive applications because of its simple programming interface, high scalability, and fault tolerance. It is well suited to applications that process large data sets with distributed processing resources, such as web data analysis, bioinformatics, and high-performance computing. Many studies have addressed job scheduling for MapReduce in shared clusters. However, there is still a need to schedule workflow services composed of multiple MapReduce tasks with precedence dependencies across multiple processing nodes. This paper contributes a scheduling mechanism for a workflow service containing multiple MapReduce jobs. A workflow application has precedence dependency constraints among its tasks, represented as a directed acyclic graph (DAG). In addition, to reduce data transfer cost under limited bisection bandwidth, data dependencies should be considered when scheduling multiple MapReduce jobs in a workflow. The proposed mechanism provides 1) scheduling of MapReduce tasks that respects precedence constraints and 2) a data pre-placement method that takes data dependency constraints into account to reduce data transfer cost over the network.
Keywords
directed graphs; distributed processing; scheduling; software fault tolerance; Web data analysis; bio informatics; data intensive application; data transfer cost; directed acyclic graph; fault tolerance capability; high performance computing; high scalability; multiple MapReduce jobs; scheduling mechanism; workflow application; workflow service scheduling; Cloud computing; Computational modeling; Distributed databases; Fault tolerance; Fault tolerant systems; Processor scheduling; Synchronization; Cloud Computing; Data Intensive Computing; MapReduce; Scheduling; Workflow Application;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computing, Communications and Applications Conference (ComComAp), 2012
Conference_Location
Hong Kong
Print_ISBN
978-1-4577-1717-8
Type
conf
DOI
10.1109/ComComAp.2012.6154882
Filename
6154882