Title :
An Asynchronous Checkpoint-Based Redundant Multithreading Architecture
Author :
Yin, Jie ; Jiang, Jianhui
Author_Institution :
Dept. of Comput. Sci. & Technol., Tongji Univ., Shanghai, China
Abstract :
Existing redundant multithreading (RMT) schemes detect faults by comparing the result of each instruction between the master and slave threads, which can incur large comparison and communication overhead. To address this problem, checkpoint-based RMT (such as RVQ_F) was proposed, but in such architectures the master thread must wait at each checkpoint for the slave thread to reach the same position. This wait may delay the release of resources occupied by the master thread and degrade performance. This paper proposes an asynchronous checkpoint-based redundant multithreading architecture (AC-RMT), in which two context-saving areas are set aside for each thread: one for detecting faults, and the other for saving the last checkpoint, which is used for fault restoration. Compared with RVQ_F, AC-RMT improves performance because master threads no longer wait at checkpoints, so the resources they occupy can be released in a timely manner.
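The two-area idea described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the names `ThreadContext`, `CheckpointPair`, `detect_slot`, and `restore_slot` are hypothetical, architectural state is reduced to a dictionary of register values, and the slave is assumed to reach each checkpoint after the master.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical stand-in for a thread's architectural state (register name -> value).
State = Dict[str, int]


@dataclass
class ThreadContext:
    """Two context-saving areas per thread (names are assumptions)."""
    detect_slot: Optional[State] = None   # snapshot waiting to be compared
    restore_slot: Optional[State] = None  # last verified checkpoint


class CheckpointPair:
    """Toy model of a master/slave pair with asynchronous checkpoints."""

    def __init__(self, initial: State) -> None:
        self.master = ThreadContext(restore_slot=dict(initial))
        self.slave = ThreadContext(restore_slot=dict(initial))

    def master_checkpoint(self, state: State) -> None:
        # The master records its snapshot and keeps running; it does not
        # wait for the slave to reach the same position.
        self.master.detect_slot = dict(state)

    def slave_checkpoint(self, state: State) -> Optional[State]:
        """Compare snapshots when the slave arrives.

        Returns None if the snapshots agree, otherwise the state that
        both threads should be rolled back to.
        """
        if self.master.detect_slot is None:
            # Assumption of this sketch: the master always leads.
            raise RuntimeError("slave reached a checkpoint before the master")
        self.slave.detect_slot = dict(state)
        agree = self.master.detect_slot == self.slave.detect_slot
        if agree:
            # Promote the verified snapshot to be the new restore point.
            self.master.restore_slot = self.master.detect_slot
            self.slave.restore_slot = self.slave.detect_slot
        # Either way, the detection areas are free for the next checkpoint.
        self.master.detect_slot = None
        self.slave.detect_slot = None
        return None if agree else dict(self.master.restore_slot)


pair = CheckpointPair({"r0": 0})
pair.master_checkpoint({"r0": 5})
ok = pair.slave_checkpoint({"r0": 5})        # snapshots agree
pair.master_checkpoint({"r0": 9})
rollback = pair.slave_checkpoint({"r0": 8})  # mismatch: fault detected
```

In this model a mismatch leaves the restore area untouched, so both threads can resume from the last checkpoint on which they agreed. How the actual architecture handles a master that reaches a further checkpoint before the previous one is verified is not stated in the abstract and is not modeled here.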
Keywords :
checkpointing; fault tolerant computing; multi-threading; parallel architectures; redundancy; AC-RMT architecture; asynchronous checkpoint; fault detection; fault restoration; master thread; redundant multithreading architecture; slave thread; SMT; checkpoint; fault-tolerance; redundant multithreading
Conference_Title :
Dependable Computing (PRDC), 2010 IEEE 16th Pacific Rim International Symposium on
Conference_Location :
Tokyo
Print_ISBN :
978-1-4244-8975-6
Electronic_ISBN :
978-0-7695-4289-8
DOI :
10.1109/PRDC.2010.27