DocumentCode :
1149375
Title :
Load Redistribution Under Failure in Distributed Systems
Author :
Chou, Timothy C.K. ; Abraham, Jacob A.
Author_Institution :
Tandem Computers
Issue :
9
fYear :
1983
Firstpage :
799
Lastpage :
808
Abstract :
To implement a distributed system with fail-soft capabilities, it is necessary to specify algorithms that redistribute the workload of a failed processor to the remaining good processors. This paper develops a general model to analyze the behavior of these algorithms in a distributed system. Such algorithms should be used with caution, as they are capable of making the entire system unstable. By unstable we mean that if a processor fails and its workload is redistributed, the increased workload directed toward the rest of the system could drive one or more processors into overload, resulting in a serious degradation of system performance. Using the general model, we have studied a class of load redistribution algorithms that use various techniques to redistribute workload. These techniques include buffering jobs arriving at the failed processor, transmitting only the jobs in the queue of the failed processor, and rerouting all jobs around the failed processor. For this class of algorithms, we have derived closed-form expressions for the performance of the system as a function of job arrival rate, job service rate, processor failure rate, and processor service rate. In addition, we have defined a criterion which, if adhered to, will guarantee system stability in the event of failure.
Keywords :
Distributed systems; computer systems modeling; high-availability systems; fault-tolerant computing; load redistribution; Algorithm design and analysis; Availability; Degradation; Fault tolerant systems; Hardware; Jacobian matrices; Modeling; Performance evaluation; Stability criteria; System performance
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/TC.1983.1676329
Filename :
1676329