• DocumentCode
    3311976
  • Title

    Fault-tolerant task management and load re-distribution on massively parallel hypercube systems

  • Author

    Ahmad, Ishfaq ; Ghafoor, Arif

  • Author_Institution
    Sch. of Comput. & Inf. Sci., Syracuse Univ., NY, USA
  • fYear
    1992
  • fDate
    16-20 Nov 1992
  • Firstpage
    750
  • Lastpage
    759
  • Abstract
    The authors present a scheme for managing real-time task allocation and load redistribution with fault-tolerance for hypercube systems. A set of processors, called fault-control processors (FCPs), can be used for keeping duplicate copies of tasks and reallocating tasks if the original processors of those tasks fail. Two-level task redundancy is used by grouping the FCPs as primary and secondary for each processor. The proposed scheme provides a high degree of fault-tolerance since each FCP itself is monitored by other FCPs. Assuming a failure-repair system environment, the performance of the proposed strategy has been evaluated and compared with a fault-free environment for 256-node and 512-node hypercubes, through simulation experiments. The authors also introduce a measure of goodness, success probability, which represents the probability of reallocated tasks meeting their deadlines despite the failures of processors. It is shown that, using the proposed scheme, a large percentage of the rescheduled tasks can still meet their deadlines. The probability of a task being lost altogether, due to multiple failures, has been shown to be extremely low.
  • Keywords
    fault tolerant computing; hypercube networks; parallel architectures; real-time systems; resource allocation; failure-repair system; fault-control processors; fault-tolerance; hypercube systems; load redistribution; real-time task allocation; success probability; task redundancy; Concurrent computing; Dynamic scheduling; Engineering management; Fault tolerance; Fault tolerant systems; Hypercubes; Large-scale systems; Load management; Real time systems; Timing
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Supercomputing '92., Proceedings
  • Conference_Location
    Minneapolis, MN
  • Print_ISBN
    0-8186-2630-5
  • Type

    conf

  • DOI
    10.1109/SUPERC.1992.236688
  • Filename
    236688