  • DocumentCode
    388766
  • Title
    Efficiently tolerating failures in asynchronous real-time distributed systems
  • Author
    Li, Peng ; Ravindran, Binoy
  • Author_Institution
    Real-Time Syst. Lab., Virginia Tech, Blacksburg, VA, USA
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    19
  • Lastpage
    26
  • Abstract
    We present a proactive resource allocation algorithm, called BEA, for fault-tolerant asynchronous real-time distributed systems. BEA considers an application model where transnode application timeliness requirements are expressed using benefit functions, and anticipated workloads during future time intervals are expressed using adaptation functions. Furthermore, BEA considers an adaptation model where subtasks of application tasks are replicated at run-time, both to tolerate failures and to share workload increases. Given these models, the objective of the algorithm is to maximize the aggregate real-time benefit and the ability to tolerate host failures during the time window of the adaptation functions. Since determining the optimal solution is computationally intractable, BEA heuristically computes near-optimal resource allocations in polynomial time. We show that BEA can achieve almost the same fault-tolerance ability as full replication, and accrue most of the real-time benefit that full replication can accrue. At the same time, BEA requires far fewer replicas than full replication and is hence cost-effective.
  • Keywords
    computational complexity; distributed algorithms; fault tolerant computing; optimisation; real-time systems; resource allocation; BEA; adaptation functions; anticipated workload; asynchronous real-time systems; benefit functions; fault-tolerant distributed systems; heuristic computation; near-optimal resource allocations; polynomial time; proactive resource allocation algorithm; real-time benefit maximization; subtask replication; transnode application timeliness; Adaptation model; Aggregates; Application software; Costs; Fault tolerance; Polynomials; Quality of service; Real time systems; Resource management; Runtime;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Assurance Systems Engineering, 2002. Proceedings. 7th IEEE International Symposium on
  • ISSN
    1530-2059
  • Print_ISBN
    0-7695-1769-2
  • Type
    conf
  • DOI
    10.1109/HASE.2002.1173096
  • Filename
    1173096