• DocumentCode
    3011376
  • Title

    A fault-tolerance mechanism in grid

  • Author

    Liang, Jin ; WeiQin, Tong ; JianQuan, Tang ; Bo, Wang

  • Author_Institution
    Sch. of Comput. Eng. & Sci., Shanghai Univ., China
  • fYear
    2003
  • fDate
    21-24 Aug. 2003
  • Firstpage
    457
  • Lastpage
    461
  • Abstract
    Grid computing has emerged as an effective technology for coupling geographically distributed resources to solve large-scale problems over wide area networks. Fault tolerance in a grid system is a significant and complex issue in securing stable and reliable performance. Various techniques already exist for detecting and correcting faults in distributed computing systems. Unfortunately, little effort has been devoted to fault tolerance in the grid environment, especially since the emergence of OGSA. A new fault-tolerant mechanism is needed to detect and recover from service faults and node crashes. Based on our previous work on capturing Java thread state and on existing mobile agent techniques, we put forward a fault-tolerant mechanism that provides effective fault-handling and recovery methods.
  • Keywords
    Java; fault tolerant computing; grid computing; mobile agents; multi-threading; system recovery; wide area networks; Java thread; distributed computing system; fault tolerance; geographically distributed resource; grid system; mobile agent technique; service fault recovery; wide area network; Computer crashes; Distributed computing; Fault detection; Fault tolerance; Fault tolerant systems; Java; Large-scale systems; Mobile agents; Wide area networks; Yarn;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Industrial Informatics, 2003. INDIN 2003. Proceedings. IEEE International Conference on
  • Print_ISBN
    0-7803-8200-5
  • Type

    conf

  • DOI
    10.1109/INDIN.2003.1300379
  • Filename
    1300379