DocumentCode :
2572494
Title :
A Task-Based Fault-Tolerance Mechanism to Hierarchical Master/Worker with Divisible Tasks
Author :
Dai, Zhihui ; Viale, Fabien ; Chi, Xuebin ; Caromel, Denis ; Lu, Zhonghua
Author_Institution :
Comput. Network & Inf. Center, China Acad. of Sci., Beijing, China
fYear :
2009
fDate :
25-27 June 2009
Firstpage :
672
Lastpage :
677
Abstract :
The master/worker (MW) API of the ProActive middleware provides an easy-to-use framework for parallelizing embarrassingly parallel applications. However, the traditional master/worker model faces great challenges as distributed computing grows in scale. A single-layer hierarchical master/worker has been implemented as a solution to the scalability issues of the MW API. In the new framework, the mainmaster communicates only with a few submasters, and each submaster manages a set of workers. A "bully election algorithm" and an "object discovery mechanism" are implemented to solve the fault-tolerance problems of the submasters. An automatic load-balancing mechanism is implemented for the hierarchical master/worker to solve divisible tasks. Moreover, the fault-tolerance mechanism has been optimized to make it more efficient.
Keywords :
fault tolerant computing; middleware; parallel processing; resource allocation; API; ProActive middleware; automatic load-balancing mechanism; bully election algorithm; distributed computing scalability; divisible tasks; object discovery mechanism; parallel applications; single-layer hierarchical master-worker; task-based fault-tolerance mechanism; Computer networks; Distributed computing; Fault tolerance; High performance computing; Java; Libraries; Middleware; Nominations and elections; Parallel programming; Scalability; ProActive; divisible task; fault-tolerance; hierarchical master/worker; load-balancing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Computing and Communications, 2009. HPCC '09. 11th IEEE International Conference on
Conference_Location :
Seoul
Print_ISBN :
978-1-4244-4600-1
Electronic_ISBN :
978-0-7695-3738-2
Type :
conf
DOI :
10.1109/HPCC.2009.35
Filename :
5167062