• DocumentCode
    3173197
  • Title

    A fault-tolerant algorithm for decentralized on-line quorum adaptation

  • Author

    Bearden, M. ; Bianchini, R.P., Jr.

  • Author_Institution
    Lucent Technol., AT&T Bell Labs., Murray Hill, NJ, USA
  • fYear
    1998
  • fDate
    23-25 June 1998
  • Firstpage
    262
  • Lastpage
    271
  • Abstract
    A quorum-based distributed mutual exclusion protocol requires each processor in a distributed system to obtain permission from a quorum of processors before accessing a resource that cannot be concurrently shared. To prevent failed quorum members from blocking access to the resource, it is desirable to remove failed processors from quorums when failures are detected. This work addresses the problem of adapting quorums on-line, while a quorum-based mutual exclusion protocol continues to operate. To preserve the quorum intersection property that is required for mutual exclusion safety, it is necessary to coordinate changes made to the quorum data structures of different processors. A solution is given in the form of QADAPT, a decentralized algorithm that guarantees safe adaptation of quorums when processors fail. QADAPT enables any set of quorum adaptations that do not violate the quorum intersection property, and enables any set of faulty processors to be removed from quorums. QADAPT has optimal message passing cost and tolerates any number of processor (halting) failures. A distributed system model is assumed that provides only point-to-point messages with no message ordering. Results from an implementation show that the algorithm´s execution time scales well in a system containing up to fifty networked workstations. Extensions of this work include on-line adaptation of quorums that are used to maintain replica consistency in distributed databases.
  • Keywords
    concurrency control; data structures; distributed algorithms; fault tolerant computing; message passing; QADAPT; decentralized algorithm; decentralized online quorum adaptation; distributed databases; distributed mutual exclusion protocol; distributed system; fault-tolerant algorithm; faulty processors; networked workstations; optimal message passing cost; point-to-point messages; quorum data structures; quorum intersection property; resource access; Access protocols; Concurrent computing; Cost function; Distributed computing; Fault tolerance; Message passing; Permission; Safety; Vents; Workstations;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Fault-Tolerant Computing, 1998. Digest of Papers. Twenty-Eighth Annual International Symposium on
  • Conference_Location
    Munich, Germany
  • ISSN
    0731-3071
  • Print_ISBN
    0-8186-8470-4
  • Type

    conf

  • DOI
    10.1109/FTCS.1998.689477
  • Filename
    689477