• DocumentCode
    1362699
  • Title

    Adaptive fault-tolerant routing in cube-based multicomputers using safety vectors

  • Author

    Wu, Jie

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Florida Atlantic Univ., Boca Raton, FL, USA
  • Volume
    9
  • Issue
    4
  • fYear
    1998
  • fDate
    4/1/1998
  • Firstpage
    321
  • Lastpage
    334
  • Abstract
    This paper studies reliable communication in cube-based multicomputers using the safety vector concept. In our approach, each node in a cube-based multicomputer of dimension n is associated with a safety vector of n bits, which is an approximate measure of the number and distribution of faults in the neighborhood. The safety vector of each node can be calculated through n-1 rounds of information exchange among neighboring nodes. Optimal unicasting between two nodes is guaranteed if the kth bit of the safety vector of the source node is one, where k is the Hamming distance between the source and destination nodes. The concept of dynamic adaptivity is introduced, representing the ability of a routing algorithm to dynamically adjust its routing adaptivity based on the fault distribution in the neighborhood. The feasibility of the proposed unicasting can be determined at the source node by comparing its safety vector with the Hamming distance between the source and destination nodes. The proposed unicasting can also be used in disconnected hypercubes, where the nodes of a hypercube are partitioned into two or more disjoint parts. We then extend the safety vector concept to general cube-based multicomputers
  • Keywords
    Hamming codes; fault tolerant computing; hypercube networks; Hamming distance; adaptive fault-tolerant routing; cube-based multicomputers; disconnected hypercubes; dynamic adaptivity; optimal unicasting; safety vector concept; safety vectors; unicasting; Fault tolerance; Heuristic algorithms; Hypercubes; Message passing; Routing; Safety; Telecommunication network reliability; Topology; Unicast
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type

    jour

  • DOI
    10.1109/71.667894
  • Filename
    667894
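
The feasibility test stated in the abstract (optimal unicasting is guaranteed when the kth bit of the source's safety vector is one, with k the Hamming distance between source and destination) can be sketched as follows. This is a minimal illustration, not the paper's implementation. The function names, the use of integers as node addresses, the list representation of the safety vector (index 0 holding bit 1), and the handling of k = 0 are all assumptions made for the example. How the safety vector itself is computed is not covered here, so it is taken as given input.

```python
from typing import Sequence


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two node addresses differ."""
    return bin(a ^ b).count("1")


def optimal_unicast_guaranteed(
    source: int, dest: int, safety_vector: Sequence[int]
) -> bool:
    """Check the source-side condition described in the abstract.

    safety_vector is assumed to hold the source node's n bits, with
    safety_vector[k - 1] storing the kth bit (hypothetical layout).
    """
    k = hamming_distance(source, dest)
    if k == 0:
        # Assumption: source and destination coincide, nothing to route.
        return True
    if k > len(safety_vector):
        raise ValueError("addresses differ in more bits than the cube dimension")
    return safety_vector[k - 1] == 1


# Example in a 4-dimensional cube: nodes 0000 and 0110 are at distance 2,
# so only the 2nd bit of the source's safety vector matters.
print(optimal_unicast_guaranteed(0b0000, 0b0110, [1, 1, 0, 0]))
print(optimal_unicast_guaranteed(0b0000, 0b0110, [1, 0, 1, 1]))
```

A False result only means that this particular guarantee does not apply; it does not by itself show that no optimal path exists.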