DocumentCode :
2047646
Title :
RFOH: A New Fault Tolerant Job Scheduler in Grid Computing
Author :
Khanli, Leili Mohammad ; Far, Maryam Etminan ; Rahmani, Amir Masoud
Author_Institution :
Dept. of Comput. Sci., Tabriz Univ., Tabriz, Iran
Volume :
1
fYear :
2010
fDate :
19-21 March 2010
Firstpage :
422
Lastpage :
425
Abstract :
The goal of grid computing is to aggregate the power of widely distributed resources. Because the probability of failure is high in such systems, fault tolerance has become a crucial concern in computational grids. In this paper, we propose a new strategy named RFOH for fault-tolerant job scheduling in computational grids. This strategy maintains the fault occurrence history of resources in the Grid Information Server (GIS). Whenever a resource broker has jobs to schedule, it uses this information in a genetic algorithm to find a near-optimal solution to the scheduling problem. It also increases the percentage of jobs executed within their specified deadlines. The experimental results show that the strategy can combine user satisfaction with reliability. Used together with checkpoint techniques, the proposed strategy can make grid scheduling more reliable and efficient.
Keywords :
checkpointing; fault tolerance; genetic algorithms; grid computing; job shop scheduling; Grid scheduling; RFOH; checkpoint techniques; computational Grid; distributed resources; fault tolerant job scheduler; genetic algorithm; grid computing; grid information server; Application software; Biological cells; Distributed computing; Fault tolerance; Genetic algorithms; Grid computing; History; Power engineering and energy; Power engineering computing; Processor scheduling;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Engineering and Applications (ICCEA), 2010 Second International Conference on
Conference_Location :
Bali Island
Print_ISBN :
978-1-4244-6079-3
Electronic_ISBN :
978-1-4244-6080-9
Type :
conf
DOI :
10.1109/ICCEA.2010.88
Filename :
5445793