• DocumentCode
    1236770
  • Title

    Immunet: Dependable Routing for Interconnection Networks with Arbitrary Topology

  • Author

    Puente, Valentin ; Gregorio, José Angel ; Vallejo, Fernando ; Beivide, Ramón

  • Author_Institution
    Cantabria Univ., Santander
  • Volume
    57
  • Issue
    12
  • fYear
    2008
  • Firstpage
    1676
  • Lastpage
    1689
  • Abstract
    A complete mechanism for tolerating multiple failures in parallel computer systems, denoted as Immunet, is described in this paper. Immunet can be applied to arbitrary topologies, either regular or irregular, exhibiting in both cases graceful performance degradation. Provided that the network remains connected, Immunet is able to deal with any number of failures regardless of their spatial and temporal distribution. Our mechanism operates on the basis of a dynamic network reconfiguration in response to failures. The network reconfiguration only employs local information recorded at the router nodes, which leads to a highly scalable system. In addition, its low cost and overhead permit a practicable hardware implementation. Finally, Immunet could allow failures to be circumvented transparently to applications running on a parallel system because it does not require dropping in-flight traffic. Only packets stored in or traveling through a broken component should be recovered by higher system levels.
  • Keywords
    fault tolerant computing; multiprocessor interconnection networks; network routing; network topology; Immunet; arbitrary topology; computer systems; dependable routing; dynamic network reconfiguration; failure tolerance; inflight traffic; interconnection networks; parallel system; performance degradation; spatial-temporal distribution; Computer Society; Concurrent computing; Costs; Degradation; Fault tolerant systems; Hardware; Multiprocessor interconnection networks; Network topology; Routing; System recovery; Interconnection architectures; Parallel Architectures; Support for reliability;
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type

    jour

  • DOI
    10.1109/TC.2008.95
  • Filename
    4531734