Title :
Toward Holistic Soft-Error-Resilient Shared-Memory Multicores
Author :
Qingchuan Shi; Omer Khan
Abstract :
A proposed lightweight, soft-error-resilient architecture for shared-memory multicores enables cores to autonomously perform redundant execution of uninterrupted instruction sequences. The distributed redundancy control mechanism operates in concert with the coherence protocol to provide resiliency for both computation and communication hardware. The Web extra at http://youtu.be/9A3oiIerI0w is a video interview in which guest editor Srinivas Devadas and author Omer Khan expand on this architecture.
Keywords :
computer architecture; protocols; redundancy; shared memory systems; autonomous redundant execution; coherence protocol; communication hardware; computation hardware; distributed redundancy control mechanism; holistic soft-error-resilient shared-memory multicores; lightweight soft-error-resilient architecture; uninterrupted instruction sequences; Amplitude modulation; Multicore processing; Program processors; Soft errors; hardware resiliency; multicore processors; redundant execution; shared memory
DOI :
10.1109/MC.2013.262