DocumentCode :
2933039
Title :
Recovering from Distributable Thread Failures with Assured Timeliness in Real-Time Distributed Systems
Author :
Curley, Edward ; Anderson, Jonathan ; Ravindran, Binoy ; Jensen, E.D.
Author_Institution :
ECE Dept., Virginia Tech, Blacksburg, VA
fYear :
2006
fDate :
2-4 Oct. 2006
Firstpage :
267
Lastpage :
276
Abstract :
We consider the problem of recovering from failures of distributable threads with assured timeliness. When a node hosting a portion of a distributable thread fails, it causes orphans - i.e., thread segments that are disconnected from the thread's root. We consider a termination model for recovering from such failures, where the orphans must be detected and aborted, and failure-exception notification must be delivered to the farthest, contiguous surviving thread segment for resuming thread execution. We present a real-time scheduling algorithm called AUA, and a distributable thread integrity protocol called TP-TR. We show that AUA and TP-TR bound the orphan cleanup and recovery time, thereby bounding thread starvation durations, and maximize the total thread accrued timeliness utility. We implement AUA and TP-TR in a real-time middleware that supports distributable threads. Our experimental studies with the implementation validate the algorithm/protocol's time-bounded recovery property and confirm their effectiveness.
Keywords :
distributed processing; software fault tolerance; system recovery; distributable thread failures; distributable thread integrity protocol; failure-exception notification; failures recovery; real time scheduling; real-time distributed systems; termination model; Concurrent computing; Middleware; Phased arrays; Protocols; Real time systems; Resource management; Scheduling algorithm; Sun; Time factors; Yarn
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Reliable Distributed Systems, 2006. SRDS '06. 25th IEEE Symposium on
Conference_Location :
Leeds
ISSN :
1060-9857
Print_ISBN :
0-7695-2677-2
Type :
conf
DOI :
10.1109/SRDS.2006.38
Filename :
4032488