• DocumentCode
    3064093
  • Title
    Selective checkpointing and rollbacks in multithreaded distributed systems
  • Author
    Kasbekar, Mangesh; Das, Chita R.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
  • fYear
    2001
  • fDate
    Apr 2001
  • Firstpage
    39
  • Lastpage
    46
  • Abstract
    Modern distributed systems are often multithreaded and object-oriented in their design. They require efficient techniques to checkpoint and restore their state in order to improve their fault tolerance. Traditional process-based distributed checkpointing and rollback algorithms suffer from the problem of false dependencies, which makes them rigid and inefficient for use with modern systems. In this paper, we develop protocols that can selectively checkpoint (and roll back) some threads of a distributed system while leaving others untouched, yet still ensure the consistency of the state resulting from such a partial rollback.
  • Keywords
    multi-threading; protocols; software fault tolerance; system recovery; distributed checkpointing; false dependencies; fault tolerance; multithreaded distributed systems; object-oriented design; partial rollback; process-based techniques; protocols; selective checkpointing; selective rollbacks; state consistency; state restoration; Checkpointing; Computer science; Design engineering; Fault tolerance; Fault tolerant systems; Modems; Multithreading; Programming profession; Protocols; Yarn;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Distributed Computing Systems, 2001. 21st International Conference on.
  • Conference_Location
    Mesa, AZ
  • Print_ISBN
    0-7695-1077-9
  • Type
    conf
  • DOI
    10.1109/ICDSC.2001.918931
  • Filename
    918931