DocumentCode :
703946
Title :
Improving MPSoC reliability through adapting runtime task schedule based on time-correlated fault behavior
Author :
Rozo Duque, Laura A. ; Monsalve Diaz, Jose M. ; Chengmo Yang
Author_Institution :
Electr. & Comput. Eng., Univ. of Delaware, Newark, DE, USA
fYear :
2015
fDate :
9-13 March 2015
Firstpage :
818
Lastpage :
823
Abstract :
The increasing susceptibility of multicore systems to temperature variations, environmental issues and different aging effects has made system reliability a crucial concern. The unpredictability of these factors makes fault behavior diverse, which the runtime task scheduler should take into account to improve overall system reliability. To this end, this paper proposes a fault-tolerant approach that models core reliability at runtime and tunes resource allocation accordingly. Given variations in fault duration, we propose a reliability model that tracks not only the faults that appear in each core but also their correlation in time. Taking this model as input, we also propose a runtime scheduling algorithm that allocates critical and vulnerable tasks to reliable cores. Experimental results show that the proposed adaptive technique delivers up to 56% improvement in application execution time compared to other techniques.
Keywords :
fault tolerance; integrated circuit reliability; multiprocessing systems; resource allocation; scheduling; system-on-chip; MPSoC reliability; aging effects; environmental issues; fault duration; fault tolerant approach; multicore systems; resource allocation; runtime scheduling algorithm; runtime task schedule; temperature variations; time-correlated fault behavior; Adaptation models; Fault tolerance; Fault tolerant systems; Resource management; Runtime; Schedules;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015
Conference_Location :
Grenoble
Print_ISBN :
978-3-9815-3704-8
Type :
conf
Filename :
7092498