• DocumentCode
    1940981
  • Title

    Dynamic snooping in a fault-tolerant distributed shared memory

  • Author

    Brown, Larry ; Wu, Jie

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Florida Atlantic Univ., Boca Raton, FL, USA
  • fYear
    1994
  • fDate
    21-24 Jun 1994
  • Firstpage
    218
  • Lastpage
    226
  • Abstract
    Distributed shared memory (DSM) allows multicomputer systems with no physically shared memory to be programmed using a shared memory paradigm. However, as the number of nodes in a system increases, the probability of a failure that can corrupt the DSM increases. This paper presents a fault-tolerant DSM (FTDSM) algorithm that can tolerate single node failures. Each page in the DSM is assigned a snooper that keeps a backup copy of the page and can take over if the page owner fails. The snooper is dynamic because the responsibility for snooping a page can migrate from node to node. The FTDSM presented is an improvement over other FTDSMs because it is scalable, is based on the efficient dynamic distributed manager (DDM) DSM algorithm, does not require the repair of a failed processor to access the DSM, and does not query all nodes to rebuild the state of the DSM. It is shown that any single node failure can be tolerated because either the owner or the snooper of a page can always be found.
  • Keywords
    distributed algorithms; distributed memory systems; fault tolerant computing; multiprocessing programs; reliability; shared memory systems; software reliability; transaction processing; dynamic distributed manager; dynamic snooping; fault-tolerant DSM; fault-tolerant distributed shared memory; multicomputer systems; shared memory paradigm; single node failures; Access protocols; Computer science; Control systems; Delay; Distributed computing; Distributed decision making; Fault tolerance; Multiprocessor interconnection networks; Physics computing; Programming profession;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Distributed Computing Systems, 1994., Proceedings of the 14th International Conference on
  • Conference_Location
    Poznan
  • Print_ISBN
    0-8186-5840-1
  • Type

    conf

  • DOI
    10.1109/ICDCS.1994.302415
  • Filename
    302415