Title : 
Level of confidence evaluation and its usage for Roll-back Recovery with Checkpointing optimization
         
        
            Author : 
Nikolov, Dimitar ; Ingelsson, Urban ; Singh, Virendra ; Larsson, Erik
         
        
            Author_Institution : 
Dept. of Comput. Sci., Linköping Univ., Linköping, Sweden
         
        
        
        
        
        
            Abstract : 
Increasing soft error rates in semiconductor devices manufactured in later technologies necessitate the use of fault-tolerant techniques such as Roll-back Recovery with Checkpointing (RRC). However, RRC introduces time overhead that increases the completion (execution) time. For non-real-time systems, research has focused on optimizing RRC and has shown that it is possible to find the optimal number of checkpoints such that the average execution time is minimal. While minimal average execution time is important, for real-time systems it is important to provide a high probability that deadlines are met. Hence, there is a need for probabilistic guarantees that jobs employing RRC complete before a given deadline. First, we present a mathematical framework for evaluating the level of confidence, that is, the probability that a given deadline is met, when RRC is employed. Second, we present an optimization method for RRC that finds the number of checkpoints yielding the minimal completion time that still satisfies a given level-of-confidence requirement. Third, we use the proposed framework to evaluate probabilistic guarantees for RRC optimization in non-real-time systems.
         
        
            Keywords : 
checkpointing; fault tolerance; optimisation; probability; real-time systems; semiconductor device manufacture; checkpointing optimization; confidence evaluation; fault tolerant technique; roll back recovery; semiconductor device; soft error rate; Checkpointing; Fault tolerance; Fault tolerant systems; Measurement; Probability distribution; Program processors;
         
        
        
        
            Conference_Title : 
Dependable Systems and Networks Workshops (DSN-W), 2011 IEEE/IFIP 41st International Conference on
         
        
            Conference_Location : 
Hong Kong
         
        
            Print_ISBN : 
978-1-4577-0374-4
         
        
            Electronic_ISBN : 
978-1-4577-0373-7
         
        
        
            DOI : 
10.1109/DSNW.2011.5958836