A heterogeneous built-in self-repair approach using system-level synthesis flexibility

Author

Hong, Inki ; Potkonjak, Miodrag ; Karri, Ramesh

Author_Institution

Synopsys Inc., Hillsboro, OR, USA

Volume

53

Issue

1

fYear

2004

fDate

3/1/2004 12:00:00 AM

Firstpage

93

Lastpage

101

Abstract

Summary and Conclusions: A novel methodology is proposed for designing fault-tolerant real-time multiprocessor systems-on-a-chip to achieve optimal productivity. The methodology employs heterogeneous built-in self-repair (BISR), based on graceful degradation and yield enhancement techniques, as an embedded optimization engine. The technique exploits the flexibility provided by the task-level scheduling and algorithm selection steps. A hardware fault model is developed for modern superscalar processors and multiprocessors, which enables an efficient treatment of the synthesis and compilation goals. For the first time, heterogeneous BISR is used at the task level. The key idea is to adapt scheduling and algorithm selection to the available nonfaulty resources. If there is a fault in memory, algorithms that use less memory are selected, and the scheduler exploits the other abundant resource, viz., the processors, more vigorously to compensate for the loss of part of the memory. Similarly, a fault in a processor is compensated for by memory. The synthesis approach minimizes the degradation in performance for single or multiple faults using simulated annealing-based algorithm selection, scheduling, and assignment algorithms. On a large set of examples, this adaptive algorithm selection and scheduling technique achieved a significant improvement in throughput compared to conventional nonadaptive schemes. The experimental results also indicate that a significant improvement in productivity can be achieved by using the extra throughput gained from the technique.
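The adaptation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the task set, the (time, memory) figures, the penalty weight, the cooling schedule, and the function names are all hypothetical. It also assumes that every task's memory is resident at once and that scheduling is a simple greedy list schedule. Each task offers a fast, memory-hungry algorithm and a slower, leaner one; simulated annealing picks one variant per task so that the makespan is minimized within the memory that is still nonfaulty.

```python
import math
import random

# Hypothetical task set: each task lists alternative algorithms as (time, memory).
TASKS = [
    [(4, 8), (7, 3)],
    [(5, 6), (9, 2)],
    [(3, 5), (6, 1)],
    [(6, 7), (8, 4)],
]


def makespan(choice, processors):
    """Greedy list schedule: longest task first onto the least-loaded processor."""
    loads = [0] * processors
    times = sorted((TASKS[i][a][0] for i, a in enumerate(choice)), reverse=True)
    for t in times:
        loads[loads.index(min(loads))] += t
    return max(loads)


def memory(choice):
    """Total memory, assuming all selected algorithms are resident together."""
    return sum(TASKS[i][a][1] for i, a in enumerate(choice))


def cost(choice, processors, mem_limit):
    """Makespan plus a large (assumed) penalty per unit of memory overshoot."""
    overshoot = max(0, memory(choice) - mem_limit)
    return makespan(choice, processors) + 100 * overshoot


def anneal(processors, mem_limit, steps=2000, seed=0):
    """Select one algorithm per task for the surviving resources."""
    rng = random.Random(seed)
    current = [0] * len(TASKS)  # start from the fastest variants
    current_cost = cost(current, processors, mem_limit)
    best, best_cost = current[:], current_cost
    temp = 10.0
    for _ in range(steps):
        # Neighbor move: re-pick the algorithm of one randomly chosen task.
        cand = current[:]
        i = rng.randrange(len(TASKS))
        cand[i] = rng.randrange(len(TASKS[i]))
        cand_cost = cost(cand, processors, mem_limit)
        delta = cand_cost - current_cost
        # Accept improvements always, and worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = cand, cand_cost
            if current_cost < best_cost:
                best, best_cost = current[:], current_cost
        temp = max(0.01, temp * 0.995)  # geometric cooling with a floor
    return best


if __name__ == "__main__":
    # Fault-free, memory fault, and processor fault scenarios (assumed capacities).
    for procs, mem in [(2, 26), (2, 16), (1, 26)]:
        sel = anneal(procs, mem)
        print(procs, mem, sel, makespan(sel, procs), memory(sel))
```

In this toy setup, shrinking `mem_limit` models a memory fault and pushes the selection toward the leaner variants, while reducing `processors` models a processor fault and leaves the fast, memory-hungry variants in place. A nonadaptive scheme would correspond to keeping the fault-free selection regardless of which resource failed.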

Keywords

adaptive scheduling; built-in self test; fault tolerance; multiprocessing systems; simulated annealing; BISR; adaptive algorithm selection; built-in self-repair approach; embedded optimization engine; fault-tolerant system; graceful degradation; hardware fault model; multiprocessor systems; superscalar processors; system-level synthesis; task-level scheduling; yield enhancement techniques; Degradation; Engines; Fault tolerant systems; Hardware; Optimization methods; Processor scheduling; Productivity; Real time systems; Scheduling algorithm; Throughput

fLanguage

English

Journal_Title

Reliability, IEEE Transactions on

Publisher

IEEE

ISSN

0018-9529

Type

jour

DOI

10.1109/TR.2003.819047

Filename

1282166