Title :
Principles of fault tolerance
Author :
White, Robert V. ; Miles, F. Marshall
Author_Institution :
AT&T Bell Labs., Dallas, TX, USA
Abstract :
The demand for continuously available electronic systems increases every day. Transaction processing, communications systems, and critical processes all require nonstop, fault-tolerant operation. A fault-tolerant or highly available system can be created by following four basic principles: redundancy, fault isolation, fault detection and annunciation, and on-line repair. This paper is a tutorial that presents those four principles after reviewing some fundamentals of reliability and availability. It concludes with an expanded discussion of implementing redundancy. Special considerations for high availability and fault tolerance in distributed power systems are highlighted.
Keywords :
fault diagnosis; fault location; power electronics; redundancy; reliability; availability; continuously available electronic systems; distributed power systems; fault annunciation; fault detection; fault isolation; fault tolerance; high availability; on-line repair; Availability; Circuit faults; Fault tolerance; Fault tolerant systems; Industrial power systems; Military computing; Power system faults; Power system management; Redundancy; Telephony
Conference_Title :
Applied Power Electronics Conference and Exposition, 1996. APEC '96. Conference Proceedings 1996., Eleventh Annual
Conference_Location :
San Jose, CA
Print_ISBN :
0-7803-3044-7
DOI :
10.1109/APEC.1996.500416