DocumentCode
2483263
Title
Tolerating client and communication failures in distributed groupware systems
Author
Shim, Hyong Sop ; Prakash, Atul
Author_Institution
Dept. of Electr. Eng. & Comput. Sci., Michigan Univ., Ann Arbor, MI, USA
fYear
1998
fDate
20-23 Oct 1998
Firstpage
221
Lastpage
227
Abstract
If a groupware system is to be used effectively, especially over a wide area network such as the Internet, where the quality of networking and computing resources is unpredictable, it should allow clients to tolerate client, link, and server failures. In particular, clients should be able to join groups and transfer the groups' current state in the presence of most client and link failures. To reduce usage overhead, disconnected clients should also be able to rejoin groups without having to restart from scratch. Furthermore, lock management and group membership should tolerate transient failures in the system. We introduce the notion of stateful group communication, which frees clients from administrative management of shared application state and allows fault-tolerant group join, state transfer, and rejoin. Stateful group communication is incorporated in Corona, a general-purpose group communication service provider. To allow groups to tolerate transient failures, Corona also provides locks with a grace period and group membership notification services that are based on client connection status. We present and discuss Corona's fault-tolerant services.
Keywords
client-server systems; computer network reliability; fault tolerant computing; groupware; Corona; Internet; administrative management; client connection status; communication failures; distributed groupware systems; fault tolerance; fault tolerant group join; fault tolerant services; group communication service provider; group membership; group membership notification services; lock management; server failures; state transfer; stateful group communication; transient failures; usage overhead; wide area network; Collaborative software; Collaborative work; Computer crashes; Computer networks; Fault tolerant systems; Hip; IP networks; Network servers; Postal services; Web server;
fLanguage
English
Publisher
IEEE
Conference_Titel
Reliable Distributed Systems, 1998. Proceedings. Seventeenth IEEE Symposium on
Conference_Location
West Lafayette, IN
ISSN
1060-9857
Print_ISBN
0-8186-9218-9
Type
conf
DOI
10.1109/RELDIS.1998.740501
Filename
740501