Title :
Real time scheduling of multiple executions of tasks to achieve fault tolerance in multiprocessor systems
Author :
Al-Asaad, Hussain
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of California, Davis, Davis, CA, USA
Abstract :
Modern computing systems are becoming increasingly vulnerable to reliability issues caused by both permanent (hard) and transient/intermittent (soft) errors. Various techniques have been proposed to incorporate redundancy into hardware or software in order to achieve the desired fault tolerance. We present a technique that executes a task multiple times on a multiprocessor system to achieve the desired confidence in the computed results. To perform the multiple executions of tasks arriving in real time, we developed a scheduling algorithm that increases the utilization ratio of existing hardware and decreases the time from receipt of a task to its completion. The algorithm schedules the executions of each task on various microprocessors so that the desired level of reliability is achieved for every incoming task. Our preliminary results show that the scheduling technique is highly effective, feasible, and promising.
Keywords :
fault tolerant computing; processor scheduling; reliability; hard errors; multiprocessor systems; permanent errors; real time scheduling; soft errors; task fault tolerance; time redundancy; transient-intermittent errors; Circuit faults; Hardware; Redundancy; Fault tolerance; task scheduling
Conference_Title :
AUTOTESTCON, 2014 IEEE
Conference_Location :
St. Louis, MO
Print_ISBN :
978-1-4799-3389-1
DOI :
10.1109/AUTEST.2014.6935165