DocumentCode :
2436435
Title :
Performance evaluation of fault tolerance for parallel applications in networked environments
Author :
Sens, Pierre ; Folliot, Bertil
Author_Institution :
Paris VI Univ., France
fYear :
1997
fDate :
11-15 Aug 1997
Firstpage :
334
Lastpage :
341
Abstract :
This paper presents the performance evaluation of a software fault manager for distributed applications. Dubbed STAR, it uses the natural redundancy existing in networks of workstations to offer a high level of fault tolerance. Fault management is transparent to the supported parallel applications. STAR is application independent, highly configurable and easily portable to UNIX-like operating systems. The current implementation is based on independent checkpointing and message logging. Measurements show the efficiency and the limits of this implementation. The challenge is to show that a software approach to fault tolerance can be implemented efficiently in a standard networked environment.
Keywords :
distributed processing; fault tolerant computing; performance evaluation; system recovery; STAR; fault management; fault tolerance; independent checkpointing; message logging; networked environments; parallel applications; performance evaluation; software fault manager; Application software; Buffer storage; Checkpointing; Environmental management; Fault tolerance; Hardware; Intelligent networks; Operating systems; Redundancy; Software performance;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel Processing, 1997., Proceedings of the 1997 International Conference on
Conference_Location :
Bloomington, IL
ISSN :
0190-3918
Print_ISBN :
0-8186-8108-X
Type :
conf
DOI :
10.1109/ICPP.1997.622663
Filename :
622663