• DocumentCode
    1068065
  • Title

    A fault-tolerant communication scheme for hypercube computers

  • Author

    Lee, Tze Chiang ; Hayes, John P.

  • Author_Institution
    IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
  • Volume
    41
  • Issue
    10
  • fYear
    1992
  • fDate
    10/1/1992 12:00:00 AM
  • Firstpage
    1242
  • Lastpage
    1256
  • Abstract
    A fault-tolerant communication scheme that facilitates near-optimal routing and broadcasting in hypercube computers subject to node failures is described. The concept of an unsafe node is introduced to identify fault-free nodes that may cause communication difficulties. It is shown that by only using `feasible´ paths that try to avoid unsafe nodes, routing and broadcasting can be substantially simplified. A computationally efficient routing algorithm that uses local information is presented. It can route a message via a path of length no greater than p+2, where p is the minimum distance from the source to the destination, provided that not all nonfaulty nodes in the hypercube are unsafe. Broadcasting can be achieved under the same fault conditions with only one more time unit than the fault-free case. The problems posed by deadlock in faulty hypercubes are discussed, and deadlock-free implementations of the proposed communication schemes are presented
  • Keywords
    fault tolerant computing; hypercube networks; parallel architectures; broadcasting; computationally efficient routing algorithm; deadlock-free implementations; fault-free nodes; fault-tolerant communication scheme; hypercube computers; near-optimal routing; node failures; Broadcasting; Concurrent computing; Costs; Distributed computing; Fault diagnosis; Fault tolerance; Hardware; Hypercubes; Routing; System recovery;
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    0018-9340
  • Type

    jour

  • DOI
    10.1109/12.166602
  • Filename
    166602