• DocumentCode
    1655897
  • Title
    A hybrid HW-SW approach for intermittent error mitigation in streaming-based embedded systems
  • Author
    Sabry, Mohamed M. ; Atienza, David ; Catthoor, Francky
  • Author_Institution
    Embedded Syst. Lab. (ESL), Ecole Polytech. Fed. de Lausanne (EPFL), Lausanne, Switzerland
  • fYear
    2012
  • Firstpage
    1110
  • Lastpage
    1113
  • Abstract
    Recent advances in process technology increase the functionality per unit area of systems-on-chip (SoCs) through a substantial decrease in device feature sizes. However, this scaling raises new reliability issues, such as increased single-event multi-bit upset (SMU) failure rates. To mitigate these failures, we propose a novel error mitigation mechanism that relies on a hybrid HW-SW technique. In our proposal, we reinforce the SoC SRAMs with a fault-tolerant memory buffer of minimal capacity to ensure error-free operation. We use this buffer to temporarily store a portion of the stored data, named a data chunk, which is used to restore another data chunk in a fully demand-driven way if the latter is faulty. We formulate the buffer and data chunk size selection as an optimization problem that minimizes energy overhead, subject to hard timing and area overhead constraints set beforehand by the system designers. We show that our proposed mitigation scheme achieves full error mitigation on a real SoC platform with an average energy overhead of 10.1% relative to baseline system operation, while meeting all design-time constraints.
  • Keywords
    SRAM chips; buffer circuits; embedded systems; fault tolerance; integrated circuit reliability; minimisation; network synthesis; system-on-chip; HW-SW approach; SMU failure rate augmentation; SoC SRAM; base-line system operation; buffer size selection; data chunk restoration; data chunk size selection; data storage; design-time constraint; energy overhead; error mitigation mechanism; error-free operation; failure mitigation; fault-tolerant memory buffer; intermittent error mitigation; optimization problem; overhead minimization target; reliability; single-event multi-bit upset failure rates augmentation; streaming-based embedded system; systems-on-chip; Benchmark testing; Buffer storage; Error correction codes; Proposals; Reliability; System-on-a-chip; Timing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Design, Automation & Test in Europe Conference & Exhibition (DATE), 2012
  • Conference_Location
    Dresden
  • ISSN
    1530-1591
  • Print_ISBN
    978-1-4577-2145-8
  • Type
    conf
  • DOI
    10.1109/DATE.2012.6176661
  • Filename
    6176661