Title :
A theory of fault-tolerant routing in wormhole networks
Author_Institution :
Fac. de Inf., Univ. Politecnica de Valencia, Spain
fDate :
8/1/1997
Abstract :
Fault-tolerant systems aim to provide continuous operation in the presence of faults. Multicomputers rely on an interconnection network between processors to support the message-passing mechanism. Therefore, the reliability of the interconnection network is very important for the reliability of the whole system. This paper analyzes the effective redundancy available in a wormhole network by combining connectivity and deadlock freedom. Redundancy is defined at the channel level. We propose a sufficient condition for channel redundancy and compute the set of redundant channels. We also define the redundancy level of the network and propose a theorem that supplies its value. This theory is developed on top of our necessary and sufficient condition for deadlock-free adaptive routing. The new theory also considers the failure of physical channels when virtual channels are used. Finally, we propose a methodology for the design of fault-tolerant routing algorithms and show its application to n-dimensional meshes.
Keywords :
fault tolerant computing; message passing; multiprocessor interconnection networks; telecommunication network routing; connectivity; deadlock freedom; deadlock-free adaptive routing; fault-tolerant routing; interconnection network; message-passing mechanism; multicomputers; necessary and sufficient condition; redundancy; virtual channels; wormhole networks; Algorithm design and analysis; Fault tolerance; Fault tolerant systems; Hypercubes; Intelligent networks; Network topology; Routing; System recovery
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on