DocumentCode :
523599
Title :
Verification for fault tolerance of the IBM System z microprocessor
Author :
Thompto, Brian W. ; Hoppe, Bodo
Author_Institution :
Syst. & Technol. Group, IBM, Austin, TX, USA
fYear :
2010
fDate :
13-18 June 2010
Firstpage :
525
Lastpage :
530
Abstract :
IBM System z processors are known for their industry-leading Reliability, Availability and Serviceability (RAS). The hardware is designed for high resilience against errors and can recover from errors while maintaining a valid architectural state. This paper describes the thorough verification effort required to prove, prior to design tape-out, that the fault tolerance of the IBM System z processor core meets these high expectations. It proposes a multifaceted verification methodology covering the various aspects of verifying correct error detection, isolation and recovery. Soft errors significantly enlarge the state space of a design. This poses a major challenge to the functional verification environment, which must tolerate the failures while still expecting architectural compliance. Several fault injection mechanisms are discussed. A special focus is on the novel methodology of Comprehensive Fault Injection (CFI), used to validate and improve the dependability characteristics of the processor core, providing improved Soft Error Resilience (SER). Feedback of the results, together with measurements of efficiency and functional coverage, is an integral part of the overall methodology, allowing smart use of the available compute resources.
Keywords :
Decision support systems; Fault tolerant systems; Microprocessors; CFI; RAS; SER; error detection; error recovery; fault injection
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Design Automation Conference (DAC), 2010 47th ACM/IEEE
Conference_Location :
Anaheim, CA, USA
ISSN :
0738-100X
Print_ISBN :
978-1-4244-6677-1
Type :
conf
Filename :
5522658