Title :
Analysis of checkpointing schemes with task duplication
Author :
Ziv, Avi ; Bruck, Jehoshua
Author_Institution :
Res. Lab., IBM Israel Sci. & Technol. Center, Haifa, Israel
fDate :
2/1/1998
Abstract :
This paper proposes a technique for analyzing the performance of checkpointing schemes with task duplication. We show how the technique can be used to derive the average execution time of a task and other important parameters related to the performance of checkpointing schemes. The results of the analysis are used to study and compare the performance of four existing checkpointing schemes. The comparison shows that, in general, the number of processors used, rather than the complexity of the scheme, has the greatest effect on scheme performance.
Keywords :
Markov processes; parallel programming; software fault tolerance; system recovery; Markov Reward Model; average execution time; checkpointing scheme performance; fault tolerance; parallel computing; processors; task duplication; Checkpointing; Concurrent computing; Costs; Fault detection; Fault tolerance; Fault tolerant systems; Hardware; Parallel processing; Performance analysis; Redundancy
Journal_Title :
Computers, IEEE Transactions on