DocumentCode
3064093
Title
Selective checkpointing and rollbacks in multithreaded distributed systems
Author
Kasbekar, Mangesh ; Das, Chita R.
Author_Institution
Dept. of Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
fYear
2001
fDate
Apr 2001
Firstpage
39
Lastpage
46
Abstract
Modern distributed systems are often multithreaded and object-oriented in their design. They require efficient techniques to checkpoint and restore their state in order to improve their fault-tolerance properties. Traditional process-based distributed checkpointing and rollback algorithms suffer from the problem of false dependencies, which makes them rigid and inefficient for use with modern systems. In this paper, we develop protocols that can selectively checkpoint (and roll back) some threads of a distributed system while leaving others untouched, yet still ensure the consistency of the state resulting from such a partial rollback.
Keywords
multi-threading; protocols; software fault tolerance; system recovery; distributed checkpointing; false dependencies; fault tolerance; multithreaded distributed systems; object-oriented design; partial rollback; process-based techniques; protocols; selective checkpointing; selective rollbacks; state consistency; state restoration; Checkpointing; Computer science; Design engineering; Fault tolerance; Fault tolerant systems; Modems; Multithreading; Programming profession; Protocols; Yarn;
fLanguage
English
Publisher
IEEE
Conference_Titel
Distributed Computing Systems, 2001. 21st International Conference on.
Conference_Location
Mesa, AZ
Print_ISBN
0-7695-1077-9
Type
conf
DOI
10.1109/ICDSC.2001.918931
Filename
918931