DocumentCode :
1554951
Title :
Concurrent detection of software and hardware data-access faults
Author :
Wilken, Kent D. ; Kong, Timothy
Author_Institution :
Dept. of Electr. & Comput. Eng., California Univ., Davis, CA, USA
Volume :
46
Issue :
4
fYear :
1997
fDate :
4/1/1997 12:00:00 AM
Firstpage :
412
Lastpage :
424
Abstract :
A new approach allows low-cost concurrent detection of two important types of faults, software and hardware data-access faults, using an extension of the existing signature monitoring approach. The proposed approach detects data-access faults using a new type of redundant data structure that contains an embedded signature. Low-cost fault detection is achieved using simple architecture support and compiler support that exploit natural redundancies in the data structures, in the instruction set architecture, and in the data-access mechanism. The software data-access faults that the approach can detect include faults that have been shown to cause a high percentage of system failures. Hardware data-access faults that occur in all levels of the data-memory hierarchy are also detectable, including faults in the register file, the data cache, the data-cache TLB, the memory address and data buses, etc. Benchmark results for the MIPS R3000 processor executing code scheduled by a modified GNU C Compiler show that the new approach can concurrently check a high percentage of data accesses, while causing little performance overhead and little memory overhead.
Keywords :
data structures; fault tolerant computing; software fault tolerance; MIPS R3000 processor; architecture support; benchmark results; compiler support; data buses; data cache; data-cache TLB; data-memory hierarchy; embedded signature; hardware data-access faults concurrent detection; instruction set architecture; memory address; memory overhead; performance overhead; redundant data structure; register file; signature monitoring; software data-access faults concurrent detection; system failures; Computer architecture; Computer errors; Condition monitoring; Costs; Data structures; Error correction; Fault detection; Hardware; Information entropy; Redundancy;
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/12.588046
Filename :
588046