DocumentCode
1655897
Title
A hybrid HW-SW approach for intermittent error mitigation in streaming-based embedded systems
Author
Sabry, Mohamed M. ; Atienza, David ; Catthoor, Francky
Author_Institution
Embedded Syst. Lab. (ESL), Ecole Polytech. Fed. de Lausanne (EPFL), Lausanne, Switzerland
fYear
2012
Firstpage
1110
Lastpage
1113
Abstract
Recent advances in process technology increase system-on-chip (SoC) functionality per unit area through a substantial decrease in device feature sizes. However, this shrinking of features triggers new reliability issues, such as an increase in single-event multi-bit upset (SMU) failure rates. To mitigate these failures, we propose a novel error mitigation mechanism that relies on a hybrid HW-SW technique. In our proposal, we reinforce SoC SRAMs by implementing a fault-tolerant memory buffer of minimal capacity to ensure error-free operation. We use this buffer to temporarily store a portion of the stored data, named a data chunk, which is used to restore another data chunk in a fully demand-driven way if the latter is faulty. We formulate the buffer and data chunk size selection as an optimization problem that targets energy overhead minimization, with timing and area overheads restricted by hard constraints decided beforehand by the system designers. We show that our proposed mitigation scheme achieves full error mitigation on a real SoC platform with an average energy overhead of 10.1% with respect to baseline system operation, while meeting all design-time constraints.
Keywords
SRAM chips; buffer circuits; embedded systems; fault tolerance; integrated circuit reliability; minimisation; network synthesis; system-on-chip; HW-SW approach; SMU failure rate augmentation; SoC SRAM; base-line system operation; buffer size selection; data chunk restoration; data chunk size selection; data storage; design-time constraint; energy overhead; error mitigation mechanism; error-free operation; failure mitigation; fault-tolerant memory buffer; intermittent error mitigation; optimization problem; overhead minimization target; reliability; single-event multi-bit upset failure rates augmentation; streaming-based embedded system; systems-on-chip; Benchmark testing; Buffer storage; Error correction codes; Proposals; Reliability; System-on-a-chip; Timing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Design, Automation & Test in Europe Conference & Exhibition (DATE), 2012
Conference_Location
Dresden
ISSN
1530-1591
Print_ISBN
978-1-4577-2145-8
Type
conf
DOI
10.1109/DATE.2012.6176661
Filename
6176661