DocumentCode
1455253
Title
An adaptive algorithm for tolerating value faults and crash failures
Author
Ren, Yansong Jennifer ; Cukier, Michel ; Sanders, William H.
Author_Institution
Center for Reliable & High Performance Comput., Illinois Univ., Urbana, IL, USA
Volume
12
Issue
2
fYear
2001
fDate
2/1/2001 12:00:00 AM
Firstpage
173
Lastpage
192
Abstract
The AQuA architecture provides adaptive fault tolerance to CORBA applications by replicating objects and providing a high-level method that an application can use to specify its desired level of dependability. This paper presents the algorithms that AQuA uses, when an application's dependability requirements can change at runtime, to tolerate both value faults in applications and crash failures simultaneously. In particular, we provide an active replication communication scheme that maintains data consistency among replicas, detects crash failures, collates the messages generated by replicated objects, and delivers the result of each vote. We also present an adaptive majority voting algorithm that enables the correct ongoing vote while both the number of replicas and the majority size dynamically change. Together, these two algorithms form the basis of the mechanism for tolerating and recovering from value faults and crash failures in AQuA.
Keywords
client-server systems; data integrity; distributed object management; fault tolerant computing; AQuA architecture; CORBA; active replication communication; adaptive algorithm; adaptive fault tolerance; adaptive majority voting algorithm; crash failures; data consistency; dependability; objects replication; value faults; Adaptive algorithm; Change detection algorithms; Computer crashes; Fault detection; Fault tolerance; Fault tolerant systems; Middleware; Object detection; Runtime; Voting
fLanguage
English
Journal_Title
Parallel and Distributed Systems, IEEE Transactions on
Publisher
IEEE
ISSN
1045-9219
Type
jour
DOI
10.1109/71.910872
Filename
910872