DocumentCode :
1991265
Title :
On the use and implementation of message logging
Author :
Elnozahy, E.N. ; Zwaenepoel, W.
Author_Institution :
Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
fYear :
1994
fDate :
15-17 June 1994
Firstpage :
298
Lastpage :
307
Abstract :
We present a number of experiments showing that for compute-intensive applications executing in parallel on clusters of workstations, message logging has higher failure-free overhead than coordinated checkpointing. Message logging protocols, however, result in much shorter output latency than coordinated checkpointing. Therefore, message logging should be used for applications involving substantial interactions with the outside world, while coordinated checkpointing should be used otherwise. We also present an unorthodox message logging design that uses coordinated checkpointing with message logging, departing from the conventional approaches that use independent checkpointing. This combination of message logging and coordinated checkpointing offers several advantages, including improved failure-free performance, bounded recovery time, simplified garbage collection, and reduced complexity. Meanwhile, the new protocols retain the advantages of the conventional message logging protocols with respect to output commit. Finally, we discuss three "lessons learned" from an implementation of various message logging protocols.
Keywords :
fault tolerant computing; message passing; protocols; reliability; storage management; bounded recovery time; coordinated checkpointing; copy-on-write; failure-free overhead; failure-free performance; garbage collection; message logging; output commit; output latency; receiver-based logging; sender-based message logging; stable storage; Application software; Batteries; Checkpointing; Computer applications; Computer science; Concurrent computing; Costs; Delay; Protocols; Workstations
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fault-Tolerant Computing, 1994. FTCS-24. Digest of Papers., Twenty-Fourth International Symposium on
Conference_Location :
Austin, TX, USA
Print_ISBN :
0-8186-5520-8
Type :
conf
DOI :
10.1109/FTCS.1994.315630
Filename :
315630