DocumentCode
2367098
Title
An Asynchronous Checkpoint-Based Redundant Multithreading Architecture
Author
Yin, Jie ; Jiang, Jianhui
Author_Institution
Dept. of Comput. Sci. & Technol., Tongji Univ., Shanghai, China
fYear
2010
fDate
13-15 Dec. 2010
Firstpage
243
Lastpage
244
Abstract
Existing redundant multithreading (RMT) detects faults by comparing the result of each instruction between the master and slave threads, which can incur large comparison and communication overhead. To address this problem, checkpoint-based RMT (such as RVQ_F) was proposed, but in such architectures master threads must wait at each checkpoint for slave threads to reach the same position. This waiting may delay the release of resources occupied by master threads and decrease performance. This paper proposes an asynchronous checkpoint-based redundant multithreading architecture (AC-RMT), in which two context saving rooms are set aside for each thread: one for detecting faults, and the other for saving the last checkpoint used for fault restoration. Compared with RVQ_F, AC-RMT improves performance because master threads do not wait at checkpoints, so resources can be released promptly.
Keywords
checkpointing; fault tolerant computing; multi-threading; parallel architectures; redundancy; AC-RMT architecture; asynchronous checkpoint; fault detection; fault restoration; master thread; redundant multithreading architecture; slave thread; SMT; checkpoint; fault-tolerance; redundant multithreading
fLanguage
English
Publisher
IEEE
Conference_Title
Dependable Computing (PRDC), 2010 IEEE 16th Pacific Rim International Symposium on
Conference_Location
Tokyo
Print_ISBN
978-1-4244-8975-6
Electronic_ISBN
978-0-7695-4289-8
Type
conf
DOI
10.1109/PRDC.2010.27
Filename
5703258