Title :
FINE: A Fully Informed aNd Efficient Communication-Induced Checkpointing Protocol
Author :
Luo, Yi ; Manivannan, D.
Author_Institution :
Dept. of Comput. Sci., Kentucky Univ., Lexington, KY
Abstract :
In this paper, we first discuss two critical data structures used in communication-induced checkpointing (CIC) protocols and their distinct roles in guaranteeing the Z-cycle-free (ZCF) property: tracking the checkpoint and communication patterns (CCPAT) in a distributed computation that can lead to Z-cycles, and preventing those Z-cycles. We then present our Transitive Dependency Enabled TimeStamp (TDE_TSS) mechanism, by which we can both timestamp each event and obtain the transitive dependency information upon receiving a message. Finally, based on this timestamping mechanism, we present our Fully Informed aNd Efficient (FINE) checkpointing algorithm, which can not only improve the performance of the Fully Informed (FI) CIC protocol proposed by Helary et al. but also decrease the overhead of piggybacked information.
Keywords :
checkpointing; distributed processing; checkpoint and communication pattern; communication-induced checkpointing protocol; data structures; piggybacked information; timestamping mechanism; transitive dependency enabled timestamp mechanism; z-cycle free property; Algorithm design and analysis; Checkpointing; Computational modeling; Computer science; Data structures; Distributed algorithms; Distributed computing; Information analysis; Mechanical factors; Protocols; Distributed systems; communication-induced checkpointing protocols; consistent global checkpoints
Conference_Title :
Systems, 2008. ICONS 08. Third International Conference on
Conference_Location :
Cancun
Print_ISBN :
978-0-7695-3105-2
Electronic_ISBN :
978-0-7695-3105-2
DOI :
10.1109/ICONS.2008.14