An Efficient Fault Recovery Algorithm in Multiprocessor Mixed-Criticality Systems

Author

Guangdong Liu ; Ying Lu ; Shige Wang

Author_Institution

Dept. of Comput. Sci. & Eng., Univ. of Nebraska-Lincoln, Lincoln, NE, USA

fYear

2013

fDate

13-15 Nov. 2013

Firstpage

2006

Lastpage

2013

Abstract

In recent years, there has been increasing interest in integrating mixed-criticality functionalities onto a shared computing platform in the automotive, avionics, and control industries. The benefits of such integration include reduced hardware cost and high computational performance. The integration also brings new challenges: it introduces interference among tasks with different criticalities, and this interference could lead to catastrophic results. Failures are also likely to be more frequent because of the interference. Hence, it is becoming increasingly important to deal with faults in mixed-criticality systems. Although several approaches have been proposed to handle failures in mixed-criticality systems, they come either with a high cost due to hardware replication (spatial redundancy) or with poor utilization due to re-execution (time redundancy). In this paper, we study a scheme that provides fault recovery through task reallocation in response to permanent faults in multiprocessor mixed-criticality systems. We present an algorithm that minimizes the number of task reallocations while retaining the promise that the most critical applications continue to meet their deadlines. We evaluate the proposed algorithm by comparing it with two baseline algorithms. To evaluate the algorithms from the perspective of mixed-criticality systems, we choose the state-of-the-art metric called ductility to formally measure the effects of deadline misses for tasks with different criticality levels. Under this metric, a high-criticality task is considered more important than all low-criticality tasks combined. The simulation results confirm the effectiveness of our proposed algorithm in both minimizing the number of task reallocations and retaining the promised performance of high-criticality tasks.
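
The ordering property stated in the abstract, that a high-criticality task outweighs all low-criticality tasks combined, can be illustrated with a lexicographic comparison of deadline-miss counts. The sketch below is only an illustration of that property and is not the ductility formula used in the paper; the function names `miss_profile` and `is_better`, and the convention that level 0 is the most critical, are assumptions made for this example.

```python
from typing import Sequence, Tuple


def miss_profile(missed_levels: Sequence[int], num_levels: int) -> Tuple[int, ...]:
    """Count deadline misses per criticality level, most critical level first.

    Assumption for this sketch: `missed_levels` holds the criticality level
    (0 = most critical) of every task that missed its deadline.
    """
    counts = [0] * num_levels
    for level in missed_levels:
        counts[level] += 1
    return tuple(counts)


def is_better(a: Tuple[int, ...], b: Tuple[int, ...]) -> bool:
    """Return True if profile `a` is strictly preferable to profile `b`.

    Python compares tuples lexicographically, so any difference at a more
    critical level decides the result before less critical levels are
    considered.
    """
    return a < b


# Outcome A: five low-criticality tasks miss their deadlines.
outcome_a = miss_profile([1, 1, 1, 1, 1], num_levels=2)
# Outcome B: a single high-criticality task misses its deadline.
outcome_b = miss_profile([0], num_levels=2)

# Under a lexicographic ordering, A is preferred despite having more misses.
print(is_better(outcome_a, outcome_b))
```

Under this ordering, a recovery strategy that sacrifices any number of low-criticality deadlines is preferred over one that lets a single high-criticality deadline slip, which matches the priority described in the abstract.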

Keywords

fault tolerant computing; multiprocessing systems; redundancy; system recovery; task analysis; criticality level; deadline; ductility; failure handling; fault recovery algorithm; formal measure; hardware replication; multiprocessor mixed criticality system; performance evaluation; shared computing; task reallocation; time redundancy; Noise measurement; Partitioning algorithms; Processor scheduling; Redundancy; Resource management; Scheduling; fault recovery; multiprocessor mixed-criticality real-time systems; task reallocation;

fLanguage

English

Publisher

IEEE

Conference_Titel

High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on

Conference_Location

Zhangjiajie

Type

conf

DOI

10.1109/HPCC.and.EUC.2013.289

Filename

6832172

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=688398