DocumentCode :
1247788
Title :
Dynamic reconfiguration in computer clusters with irregular topologies in the presence of multiple node and link failures
Author :
Avresky, Dimiter ; Natchev, Natcho
Author_Institution :
Dept. of Electr. & Comput. Eng., Northeastern Univ., Boston, MA, USA
Volume :
54
Issue :
5
fYear :
2005
fDate :
5/1/2005 12:00:00 AM
Firstpage :
603
Lastpage :
615
Abstract :
Component failures in high-speed computer networks can result in significant topological changes. In such cases, a network reconfiguration algorithm must be executed to restore the connectivity between the network nodes. Most contemporary networks either use static reconfiguration algorithms or stop the user traffic in order to prevent cyclic dependencies in the routing tables. This paper presents NetRec, a dynamic network reconfiguration algorithm for tolerating multiple node and link failures in high-speed networks with arbitrary topology. The algorithm updates the routing tables asynchronously and does not require any global knowledge about the network topology. Certain phases of NetRec are executed in parallel, which reduces the reconfiguration time. The algorithm suspends the application traffic in small regions of the network only while the routing tables are being updated. The message complexity of NetRec is analyzed, and the termination, liveness, and safety of the proposed algorithm are proven. Additionally, results from validation of the algorithm in Distant, a distributed network-validation testbed based on the MPI 1.2 features for building arbitrary virtual topologies, are presented.
Keywords :
communication complexity; computer network management; computer network reliability; fault tolerant computing; telecommunication links; telecommunication network routing; telecommunication network topology; workstation clusters; Distant distributed network-validation testbed; NetRec dynamic network reconfiguration algorithm; component failures; cyclic dependency; fault tolerance; high-speed computer networks; link failures; message complexity; multiple node failure; network topology; routing tables; static reconfiguration algorithms; workstation cluster; Communication system routing; Complexity theory; Computer fault tolerance; Computer network management; Computer network reliability; Dynamic reconfiguration; clusters of workstations; fault tolerance; irregular topologies; multiple node and link failures
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/TC.2005.76
Filename :
1407849