• DocumentCode
    2938656
  • Title
    Enhancing replica management services to tolerate group failures
  • Author
    Ezhilchelvan, Paul D. ; Shrivastava, Santosh K.
  • Author_Institution
    Dept. of Comput. Sci., Newcastle upon Tyne Univ., UK
  • fYear
    1999
  • fDate
    1999
  • Firstpage
    263
  • Lastpage
    268
  • Abstract
    In a distributed system, replication of components, such as objects, is a well-known way of achieving availability. For increased availability, crashed and disconnected components must be replaced by new components on available spare nodes. In this context, we address the problem of reconfiguring a group after the group as an entity has failed. Such a failure is termed a group failure; examples are the crash of every component in the group, or the group being partitioned into minority islands. The solution assumes crash-proof storage, eventual recovery of crashed nodes, and eventual healing of partitions. It guarantees that: (i) the number of groups reconfigured after a group failure is never more than one, and (ii) the reconfigured group contains a majority of the components which were members just before the group failed, so that the loss of state information due to group failure is minimal. The protocol is efficient in terms of communication rounds and use of stable store, during both normal operations and reconfiguration after a group failure.
  • Keywords
    configuration management; distributed processing; fault tolerant computing; object-oriented programming; system recovery; availability; communication rounds; crash-proof storage; crashed nodes; disconnected components; distributed system; group failure tolerance; group reconfiguration; normal operations; protocol; replica management services; replication; stable store; state information; Availability; Computer crashes; Distributed computing; Protocols; Resumes; Voting;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the 2nd IEEE International Symposium on Object-Oriented Real-Time Distributed Computing (ISORC '99), 1999
  • Conference_Location
    Saint-Malo
  • Print_ISBN
    0-7695-0207-5
  • Type
    conf
  • DOI
    10.1109/ISORC.1999.776388
  • Filename
    776388