DocumentCode :
2721043
Title :
Non-blocking adaptive cycles: Deadlock avoidance for fault-tolerant interconnection networks
Author :
Zarza, Gonzalo ; Lugones, Diego ; Franco, Daniel ; Luque, Emilio
Author_Institution :
Comput. Archit. & Oper. Syst. Dept., Univ. Autonoma de Barcelona, Barcelona, Spain
fYear :
2010
fDate :
20-24 Sept. 2010
Firstpage :
1
Lastpage :
4
Abstract :
The interconnection network links the processing units of modern high-performance computing systems and carries the communication between them. In this context, network faults have an extremely high impact, since most routing algorithms were not designed to tolerate faults. As a result, even a single fault may stall messages in the network, preventing applications from completing, or may lead to deadlocked configurations. In this paper, we introduce a scalable deadlock avoidance technique specifically designed for large interconnection networks suffering from a large number of dynamic faults. Our method is based on adding one-slot deadlock avoidance buffers and does not require any virtual channels. Additionally, fully-adaptive routing algorithms may be designed on the basis of our proposal.
Keywords :
fault tolerant computing; multiprocessor interconnection networks; fault-tolerant interconnection network; fully-adaptive routing algorithm; high-performance computing system; nonblocking adaptive cycles; scalable deadlock avoidance technique; Fault tolerance; Fault tolerant systems; Heuristic algorithms; Multiprocessor interconnection; Routing; Routing protocols; System recovery; Interconnection networks; adaptive routing; deadlock avoidance; fault tolerance
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cluster Computing Workshops and Posters (CLUSTER WORKSHOPS), 2010 IEEE International Conference on
Conference_Location :
Heraklion, Crete
Print_ISBN :
978-1-4244-8395-2
Electronic_ISBN :
978-1-4244-8397-6
Type :
conf
DOI :
10.1109/CLUSTERWKSP.2010.5613085
Filename :
5613085