DocumentCode
63753
Title
Dependability analysis for fault-tolerant computer systems using dynamic fault graphs
Author
Zhao Feng ; Jin Hai ; Zou Deqing ; Qin Pan
Author_Institution
Sch. of Comput. Sci. & Technol., Huazhong Univ. of Sci. & Technol., Wuhan, China
Volume
11
Issue
9
fYear
2014
fDate
Sept. 2014
Firstpage
16
Lastpage
30
Abstract
Dependability analysis is an important step in designing and analyzing safety computer systems and protection systems. Introducing multi-processors and virtual machines increases the complexity, diversity and dynamics of system faults, in particular for software-induced failures, with an impact on the overall dependability. Moreover, it is very difficult for a safety system to operate successfully at any active phase, since there is a huge difference in failure rate between hardware-induced and software-induced failures. To handle these difficulties and achieve an accurate dependability evaluation that consistently reflects the construct it measures, a new formalism derived from dynamic fault graphs (DFG) is developed in this paper. DFG exploits the concept of a system event as a sequence of fault states to represent dynamic behaviors, which allows probabilistic measures to be computed at each timestamp when a change occurs. The approach automatically combines the reliability analysis with the system dynamics. In this paper, we describe how the proposed methodology supports overall system dependability analysis through the phases of modeling, structural discovery and probability analysis, and we illustrate it using an example of a virtual computing system.
Keywords
fault tolerant computing; graph theory; probability; DFG; active phase; dynamic behavior representation; dynamic fault graphs; failure rate; fault state sequences; fault-tolerant computer systems; hardware-induced failures; modeling phase; multiprocessors; probabilistic measures; probability analysis phase; protection system analysis; protection system design; reliability analysis; safety computer system analysis; safety computer system design; safety system; software-induced failures; structural discovery phase; system dependability analysis; system dynamics; system event; system fault complexity; system fault diversity; system fault dynamic; timestamp; virtual computing system; virtual machine; Computational modeling; Fault tolerance; Fault tolerant systems; Logic gates; Markov processes; Probabilistic logic; dependability analysis; dynamic fault-graph; fault-tolerant system; probability forecast; structural link
fLanguage
English
Journal_Title
Communications, China
Publisher
IEEE
ISSN
1673-5447
Type
jour
DOI
10.1109/CC.2014.6969708
Filename
6969708