Title :
An agent oriented proactive fault-tolerant framework for grid computing
Author :
Huda, Mohammad Tanvir ; Schmidt, Heinz W. ; Peake, Ian D.
Author_Institution :
Centre for Distributed Syst. & Software Eng., Monash Univ., Melbourne, Vic.
Abstract :
Because of computational grid heterogeneity, scale and complexity, faults become likely. Therefore, grid infrastructure must have mechanisms to deal with faults while also providing efficient and reliable services to its end users. Existing fault-tolerant approaches are inefficient because they are reactive and incomplete. They are reactive because they only deal with faults when they take place; they are incomplete because they only deal with certain types of faults. Proactive approaches increase efficiency by reducing the cost and time of operations and network resource usage by maintaining the state of executing applications and resuming operation when rescheduled. This paper presents an agent oriented, fault-tolerant grid framework where agents deal with individual faults proactively. Agents maintain information about hardware conditions, executing process memory consumption, available resources, network conditions and component mean time to failure. Based on this information and critical states, agents can improve the reliability and efficiency of grid services.
Keywords :
fault tolerant computing; grid computing; multi-agent systems; resource allocation; agent oriented proactive fault-tolerant framework; computational grid heterogeneity; grid infrastructure; grid service reliability; memory consumption; network conditions; network resource; resource availability; Costs; Distributed computing; Fault tolerance; Fault trees; Hardware; Maintenance; Pervasive computing; Random access memory; Software engineering
Conference_Title :
e-Science and Grid Computing, 2005. First International Conference on
Conference_Location :
Melbourne, Vic.
Print_ISBN :
0-7695-2448-6
DOI :
10.1109/E-SCIENCE.2005.15