• DocumentCode
    2483263
  • Title

    Tolerating client and communication failures in distributed groupware systems

  • Author

    Shim, Hyong Sop ; Prakash, Atul

  • Author_Institution
    Dept. of Electr. Eng. & Comput. Sci., Michigan Univ., Ann Arbor, MI, USA
  • fYear
    1998
  • fDate
    20-23 Oct 1998
  • Firstpage
    221
  • Lastpage
    227
  • Abstract
    If a groupware system is to be effectively used, especially over a wide area network such as the Internet, where the quality of networking and computing resources is unpredictable, it should allow clients to tolerate client, link, and server failures. In particular, clients should be able to join groups and transfer groups' current state in the presence of most client and link failures. In order to reduce usage overhead, disconnected clients should also be able to rejoin groups without having to restart from scratch. Furthermore, lock management and group membership should tolerate transient failures in the system. We introduce the notion of stateful group communication, which frees clients of administrative management of shared application state and allows fault-tolerant group join, state transfer, and rejoin. Stateful group communication is incorporated in Corona, a general-purpose group communication service provider. In order to allow groups to tolerate transient failures, Corona also provides locks with a grace period and group membership notification services that are based on client connection status. We present and discuss Corona's fault-tolerant services.
  • Keywords
    client-server systems; computer network reliability; fault tolerant computing; groupware; Corona; Internet; administrative management; client connection status; communication failures; distributed groupware systems; fault tolerance; fault tolerant group join; fault tolerant services; group communication service provider; group membership; group membership notification services; lock management; server failures; state transfer; stateful group communication; transient failures; usage overhead; wide area network; Collaborative software; Collaborative work; Computer crashes; Computer networks; Fault tolerant systems; Hip; IP networks; Network servers; Postal services; Web server;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Reliable Distributed Systems, 1998. Proceedings. Seventeenth IEEE Symposium on
  • Conference_Location
    West Lafayette, IN
  • ISSN
    1060-9857
  • Print_ISBN
    0-8186-9218-9
  • Type

    conf

  • DOI
    10.1109/RELDIS.1998.740501
  • Filename
    740501