DocumentCode :
3011376
Title :
A fault-tolerance mechanism in grid
Author :
Liang, Jin ; WeiQin, Tong ; JianQuan, Tang ; Bo, Wang
Author_Institution :
Sch. of Comput. Eng. & Sci., Shanghai Univ., China
fYear :
2003
fDate :
21-24 Aug. 2003
Firstpage :
457
Lastpage :
461
Abstract :
The grid has emerged as an effective technology for coupling geographically distributed resources to solve large-scale problems over wide area networks. Fault tolerance in grid systems is a significant and complex issue in securing stable and reliable performance. Various techniques exist for detecting and correcting faults in distributed computing systems. Unfortunately, little effort has focused on fault tolerance in grid environments, especially since the emergence of OGSA. A new fault-tolerant mechanism is needed to detect and recover from service faults and node crashes. Based on our previous work on capturing Java thread state and on existing mobile agent techniques, we put forward a fault-tolerant mechanism that provides effective fault-handling and recovery methods.
Keywords :
Java; fault tolerant computing; grid computing; mobile agents; multi-threading; system recovery; wide area networks; Java thread; distributed computing system; fault tolerance; geographically distributed resource; grid system; mobile agent technique; service fault recovery; wide area network; Computer crashes; Distributed computing; Fault detection; Fault tolerance; Fault tolerant systems; Java; Large-scale systems; Mobile agents; Wide area networks; Yarn;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Industrial Informatics, 2003. INDIN 2003. Proceedings. IEEE International Conference on
Print_ISBN :
0-7803-8200-5
Type :
conf
DOI :
10.1109/INDIN.2003.1300379
Filename :
1300379