Title :
Soft Error Resiliency Characterization on IBM BlueGene/Q Processor
Author :
Cher, Chen-Yong ; Muller, K. Paul ; Haring, Ruud A. ; Satterfield, David L. ; Musta, Thomas E. ; Gooding, Thomas M. ; Davis, Kristan D. ; Dombrowa, Marc B. ; Kopcsay, Gerard V. ; Senger, Robert M. ; Sugawara, Yoko ; Sugavanam, Krishnan
Abstract :
Soft Error Resiliency (SER) is a major concern for petascale high-performance computing (HPC) systems. In designing Blue Gene/Q (BG/Q) [8], many mechanisms were deployed to target SER, including extensive use of Silicon-On-Insulator (SOI), radiation-hardened latches [7,13], detection and correction in on-chip arrays, and very low radiation packaging materials. On the other hand, it is well known that application behavior has a major impact on the masking (or “derating”) factor in system SER calculations. The principal goal of this project is to understand how BG/Q hardware and high-performance applications interact with respect to SER by performing and evaluating a chip irradiation experiment.
Keywords :
fault tolerant computing; microprocessor chips; multiprocessing systems; parallel processing; HPC systems; IBM BlueGene processor; IBM Q processor; SER calculations; SER characterization; SOI; application behavior; chip irradiation experiment; derating factor; masking factor; on-chip arrays; petascale high performance computing systems; radiation-hardened latches; silicon-on-insulator; soft error resiliency characterization; very low radiation packaging materials; Benchmark testing; Circuit faults; Hardware; Latches; Neutrons; Packaging; Radiation effects
Conference_Title :
2014 19th Asia and South Pacific Design Automation Conference (ASP-DAC)
Conference_Location :
Singapore
DOI :
10.1109/ASPDAC.2014.6742920