• DocumentCode
    2721043
  • Title
    Non-blocking adaptive cycles: Deadlock avoidance for fault-tolerant interconnection networks
  • Author
    Zarza, Gonzalo ; Lugones, Diego ; Franco, Daniel ; Luque, Emilio
  • Author_Institution
    Comput. Archit. & Oper. Syst. Dept., Univ. Autonoma de Barcelona, Barcelona, Spain
  • fYear
    2010
  • fDate
    20-24 Sept. 2010
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    The interconnection network communicates and links together the processing units of modern high-performance computing systems. In this context, network faults have an extremely high impact since most routing algorithms were not designed to tolerate faults. Because of this, just a single fault may stall messages in the network, preventing the finalization of applications, or may lead to deadlocked configurations. In this paper we introduce a scalable deadlock avoidance technique specifically designed to deal with large interconnection networks suffering from a large number of dynamic faults. Our method is based on adding one-slot deadlock avoidance buffers and does not require the use of any virtual channels. Additionally, fully-adaptive routing algorithms may be designed on the basis of our proposal.
  • Keywords
    fault tolerant computing; multiprocessor interconnection networks; fault-tolerant interconnection network; fully-adaptive routing algorithm; high-performance computing system; nonblocking adaptive cycles; scalable deadlock avoidance technique; Fault tolerance; Fault tolerant systems; Heuristic algorithms; Multiprocessor interconnection; Routing; Routing protocols; System recovery; Interconnection networks; adaptive routing; deadlock avoidance; fault tolerance
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing Workshops and Posters (CLUSTER WORKSHOPS), 2010 IEEE International Conference on
  • Conference_Location
    Heraklion, Crete
  • Print_ISBN
    978-1-4244-8395-2
  • Electronic_ISBN
    978-1-4244-8397-6
  • Type
    conf
  • DOI
    10.1109/CLUSTERWKSP.2010.5613085
  • Filename
    5613085