DocumentCode
2923591
Title
BLOCKWATCH: Leveraging similarity in parallel programs for error detection
Author
Wei, Jiesheng ; Pattabiraman, Karthik
Author_Institution
Dept. of Electr. & Comput. Eng., Univ. of British Columbia (UBC), Vancouver, BC, Canada
fYear
2012
fDate
25-28 June 2012
Firstpage
1
Lastpage
12
Abstract
The scaling of silicon devices has exacerbated the unreliability of modern computer systems, and power constraints have necessitated the involvement of software in hardware error detection. Simultaneously, the multi-core revolution has impelled software to become parallel. Therefore, there is a compelling need to protect parallel programs from hardware errors. Parallel programs' tasks have significant similarity in control data due to the use of high-level programming models. In this study, we propose BLOCKWATCH to leverage the similarity in parallel programs' control data for detecting hardware errors. BLOCKWATCH statically extracts the similarity among different threads of a parallel program and checks the similarity at runtime. We evaluate BLOCKWATCH on seven SPLASH-2 benchmarks to measure its performance overhead and error detection coverage. We find that BLOCKWATCH incurs an average overhead of 16% across all programs, and provides an average SDC coverage of 97% for faults in the control data.
Keywords
error detection; multiprocessing systems; parallel programming; program diagnostics; software performance evaluation; software reliability; BLOCKWATCH; SPLASH-2 benchmarks; Silicon devices; computer systems; hardware error detection; high-level programming models; multicore revolution; parallel program protection; similarity leverage; static analysis; Computers; Hardware; Instruction sets; Multicore processing; Runtime; Control-data; Parallel programs; Runtime checks; SPMD; Static Analysis;
fLanguage
English
Publisher
IEEE
Conference_Titel
Dependable Systems and Networks (DSN), 2012 42nd Annual IEEE/IFIP International Conference on
Conference_Location
Boston, MA
ISSN
1530-0889
Print_ISBN
978-1-4673-1624-8
Electronic_ISBN
1530-0889
Type
conf
DOI
10.1109/DSN.2012.6263959
Filename
6263959
Link To Document