• DocumentCode
    2724418
  • Title

    Intelligent Selection of Fault Tolerance Techniques on the Grid

  • Author

    Vanderster, Daniel C. ; Dimopoulos, Nikitas J. ; Sobie, Randall J.

  • Author_Institution
    Univ. of Victoria, Victoria
  • fYear
    2007
  • fDate
    10-13 Dec. 2007
  • Firstpage
    69
  • Lastpage
    76
  • Abstract
    The emergence of computational grids has led to an increased reliance on task schedulers that can guarantee the completion of tasks that are executed on unreliable systems. There are three common techniques for providing task-level fault tolerance on a grid: retrying, replicating, and checkpointing. While these techniques are varyingly successful at providing resilience to faults, each of them presents a tradeoff between performance and resource cost. As such, tasks having unique urgency requirements would ideally be placed using one of the techniques; for example, urgent tasks are likely to prefer the replication technique, which guarantees timely completion, whereas low priority tasks should not incur any extra resource cost in the name of fault tolerance. This paper introduces a placement and selection strategy which, by computing the utility of each fault tolerance technique in relation to a given task, finds the set of allocation options that optimizes the global utility. Heuristics which take into account the value offered by a user, the estimated resource cost, and the estimated response time of an option are presented. Simulation results show that the resulting allocations have improved fault tolerance, runtime, and profit, and allow users to prioritize their tasks.
  • Keywords
    grid computing; software fault tolerance; computational grids; intelligent selection; task-level fault tolerance; Checkpointing; Computational intelligence; Computational modeling; Costs; Delay; Fault tolerance; Grid computing; Processor scheduling; Resilience; Runtime;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    e-Science and Grid Computing, IEEE International Conference on
  • Conference_Location
    Bangalore
  • Print_ISBN
    978-0-7695-3064-2
  • Type

    conf

  • DOI
    10.1109/E-SCIENCE.2007.45
  • Filename
    4426873