DocumentCode
289998
Title
Fault-tolerance on regular decomposition grid applications
Author
Silva, Luis Moura ; Silva, Joao Gabriel ; Chapple, Simon ; Clarke, Lyndon
Author_Institution
Dept. de Engenharia Informatica, Coimbra Univ., Portugal
fYear
1995
fDate
25-27 Jan 1995
Firstpage
358
Lastpage
365
Abstract
Writing parallel applications is considerably more complex than writing sequential ones because of additional problems not found in the sequential environment. The main problems include communication, synchronization, data partitioning and distribution, mapping of processes, heterogeneity, and fault tolerance. Fault tolerance is a very important feature in parallel/distributed systems, since the mean time between failures of the system decreases with the number of processors, and the failure of just one process(or) can lead to the crash of the entire application. This paper presents an example of a parallel library (PUL-RD) that solves most of the problems listed above and provides support for fault tolerance. The original version of the library offers high-level support for parallelism in a portable way and can be used to write grid-based parallel applications that have a regular decomposition. In this paper, we describe the fault-tolerance features that were incorporated into PUL-RD, giving special attention to the functionality of the checkpointing scheme.
Keywords
fault tolerant computing; software fault tolerance; synchronisation; PUL-RD; checkpointing; communication; distribution; fault-tolerance; heterogeneity; high-level support; regular decomposition grid applications; synchronization data partitioning; Application software; Computer crashes; Fault tolerance; Libraries; Parallel processing; Parallel programming; Programming profession; Software reusability; Utility programs; Writing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing, 1995. Proceedings. Euromicro Workshop on
Conference_Location
San Remo
Print_ISBN
0-8186-7031-2
Type
conf
DOI
10.1109/EMPDP.1995.389187
Filename
389187