DocumentCode :
63753
Title :
Dependability analysis for fault-tolerant computer systems using dynamic fault graphs
Author :
Zhao Feng ; Jin Hai ; Zou Deqing ; Qin Pan
Author_Institution :
Sch. of Comput. Sci. & Technol., Huazhong Univ. of Sci. & Technol., Wuhan, China
Volume :
11
Issue :
9
fYear :
2014
fDate :
Sept. 2014
Firstpage :
16
Lastpage :
30
Abstract :
Dependability analysis is an important step in designing and analyzing safety computer systems and protection systems. Introducing multiprocessors and virtual machines increases the complexity, diversity, and dynamics of system faults, in particular for software-induced failures, with an impact on overall dependability. Moreover, it is very difficult for a safety system to operate successfully at every active phase, since there is a huge difference in failure rate between hardware-induced and software-induced failures. To handle these difficulties and achieve an accurate dependability evaluation that consistently reflects the construct it measures, this paper develops a new formalism derived from dynamic fault graphs (DFG). DFG exploits the concept of a system event as a sequence of fault states to represent dynamic behaviors, which allows probabilistic measures to be computed at each timestamp when a change occurs. The approach automatically combines reliability analysis with the system dynamics. The paper describes how the proposed methodology drives the overall system dependability analysis through the phases of modeling, structural discovery, and probability analysis, and illustrates it with an example of a virtual computing system.
Keywords :
fault tolerant computing; graph theory; probability; DFG; active phase; dynamic behavior representation; dynamic fault graphs; failure rate; fault state sequences; fault-tolerant computer systems; hardware-induced failures; modeling phase; multiprocessors; probabilistic measures; probability analysis phase; protection system analysis; protection system design; reliability analysis; safety computer system analysis; safety computer system design; safety system; software-induced failures; structural discovery phase; system dependability analysis; system dynamics; system event; system fault complexity; system fault diversity; system fault dynamic; timestamp; virtual computing system; virtual machine; Computational modeling; Fault tolerance; Fault tolerant systems; Logic gates; Markov processes; Probabilistic logic; dependability analysis; dynamic fault-graph; fault-tolerant system; probability forecast; structural link
fLanguage :
English
Journal_Title :
Communications, China
Publisher :
IEEE
ISSN :
1673-5447
Type :
jour
DOI :
10.1109/CC.2014.6969708
Filename :
6969708