Title : 
Program fault tolerance based on memory access behavior
         
        
            Author : 
Bowen, N.S. ; Pradhan, D.K.
         
        
            Author_Institution : 
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
         
        
        
        
        
        
            Abstract : 
Fault observability based on the behavior of memory references is studied. As opposed to traditional studies that view memory as one large entity that must work completely to be considered reliable, this study emphasizes the usage patterns of a particular program's memory. Expressions for the successful execution of a program that take into account the usage of the data are developed. Three variations that depend on whether the program's storage is pre-allocated, dynamically allocated, or constrained in allocation are presented. A theory is proposed to explain the phenomenon, observed in several studies, that increased workloads lead to increased failure rates. The model is used to study several program traces, and it is shown that increased workloads could cause an increase in the observed failure rates in the range of 27% to 53%.
         
        
            Keywords : 
fault tolerant computing; program testing; fault observability; memory access behavior; memory references; program fault tolerance; program traces; Database systems; Delay; Error correction codes; Failure analysis; Fault tolerance; Observability; Performance analysis; Reliability
         
        
        
        
            Conference_Title : 
Fault-Tolerant Computing, 1991. FTCS-21. Digest of Papers., Twenty-First International Symposium
         
        
            Conference_Location : 
Montreal, Quebec, Canada
         
        
            Print_ISBN : 
0-8186-2150-8
         
        
        
            DOI : 
10.1109/FTCS.1991.146696