DocumentCode :
3133436
Title :
Fault tolerance on multicore processors using deterministic multithreading
Author :
Mushtaq, Hamid ; Al-Ars, Zaid ; Bertels, Koen
Author_Institution :
Comput. Eng. Lab., Delft Univ. of Technol., Delft, Netherlands
fYear :
2013
fDate :
16-18 Dec. 2013
Firstpage :
1
Lastpage :
6
Abstract :
This paper describes a software-based fault tolerance approach for multithreaded programs running on multicore processors. Redundant multithreaded processes are used to detect soft errors and recover from them. Our scheme ensures that the redundant processes execute identically even in the presence of non-determinism due to shared memory accesses. It does so by making the redundant processes acquire the locks for accessing shared memory in the same order. Instead of using a record/replay technique to achieve this, our scheme is based on deterministic multithreading, meaning that for the same input, a multithreaded program always has the same lock interleaving. Unlike record/replay systems, this eliminates the need for communication between the redundant processes. Moreover, our scheme is implemented entirely in software and requires no special hardware, which makes it very portable. Furthermore, it is implemented entirely at user level and requires no modification of the kernel. For selected benchmarks, our scheme adds an average overhead of 49% for 4 threads.
Keywords :
multi-threading; multiprocessing systems; software fault tolerance; deterministic multithreading; lock interleaving; multicore processors; multithreaded program; multithreaded programs; record-replay technique; redundant multithreaded process; redundant process execution; shared memory access; soft error detection; soft error recovery; software-based fault tolerance approach; user-level; Clocks; Fault tolerance; Fault tolerant systems; Hardware; Instruction sets; Memory management; Optimization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Design and Test Symposium (IDT), 2013 8th International
Conference_Location :
Marrakesh
Type :
conf
DOI :
10.1109/IDT.2013.6727107
Filename :
6727107