• DocumentCode
    289998
  • Title

    Fault-tolerance on regular decomposition grid applications

  • Author

    Silva, Luis Moura ; Silva, Joao Gabriel ; Chapple, Simon ; Clarke, Lyndon

  • Author_Institution
    Dept. de Engenharia Informatica, Coimbra Univ., Portugal
  • fYear
    1995
  • fDate
    25-27 Jan 1995
  • Firstpage
    358
  • Lastpage
    365
  • Abstract
    Writing parallel applications is considerably more complex than writing sequential ones, owing to additional problems that do not arise in a sequential environment. The main problems include communication, synchronization, data partitioning and distribution, mapping of processes, heterogeneity and fault tolerance. Fault tolerance is a very important feature in parallel/distributed systems since the mean time between failures of the system decreases with the number of processors, and the failure of just one process(or) can lead to the crash of the entire application. This paper presents an example of a parallel library (PUL-RD) that solves most of the problems listed above and provides support for fault tolerance. The original version of the library offers high-level support for parallelism in a portable way and can be used to write grid-based parallel applications which have a regular decomposition. In this paper, we describe the fault-tolerance features that were incorporated into PUL-RD, giving special attention to the functionality of the checkpointing scheme.
  • Keywords
    fault tolerant computing; software fault tolerance; synchronisation; PUL-RD; checkpointing; communication; distribution; fault-tolerance; heterogeneity; high-level support; regular decomposition grid applications; synchronization data partitioning; Application software; Computer crashes; Fault tolerance; Libraries; Parallel processing; Parallel programming; Programming profession; Software reusability; Utility programs; Writing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing, 1995. Proceedings. Euromicro Workshop on
  • Conference_Location
    San Remo
  • Print_ISBN
    0-8186-7031-2
  • Type

    conf

  • DOI
    10.1109/EMPDP.1995.389187
  • Filename
    389187