Title :
A theory of fault-tolerant routing in wormhole networks
Author_Institution :
Fac. de Inf., Univ. Politecnica de Valencia, Spain
Abstract :
Fault-tolerant systems aim to provide continuous operation in the presence of faults. Multicomputers rely on an interconnection network between processors to support message passing, so the reliability of the interconnection network is critical to the reliability of the whole system. This paper analyzes the effective redundancy available in a wormhole network by combining connectivity and deadlock freedom. Redundancy is defined at the channel level. We propose a sufficient condition for channel redundancy and compute the set of redundant channels. We also define the redundancy level of the network and give a theorem that supplies its value. This theory is developed on top of our necessary and sufficient condition for deadlock-free adaptive routing. Finally, a fault-tolerant routing algorithm for n-dimensional meshes is proposed.
Keywords :
concurrency control; fault tolerant computing; message passing; multiprocessor interconnection networks; network routing; parallel algorithms; reliability; channel level; channel redundancy; connectivity; continuous operations; deadlock; deadlock-free adaptive routing; fault-tolerant routing; fault-tolerant routing algorithm; fault-tolerant systems; interconnection network; interconnection network reliability; message-passing; multicomputers; n-dimensional meshes; redundancy; wormhole networks; Algorithm design and analysis; Computer network reliability; Fault tolerance; Fault tolerant systems; Intelligent networks; Multiprocessor interconnection networks; Redundancy; Routing; System recovery; Telecommunication network reliability
Conference_Title :
Parallel and Distributed Systems, 1994. International Conference on
Conference_Location :
Hsinchu
Print_ISBN :
0-8186-6555-6
DOI :
10.1109/ICPADS.1994.590404