Title : 
Distributed, deadlock-free routing in faulty, pipelined, direct interconnection networks
         
        
            Author : 
Gaughan, Patrick T. ; Dao, Binh V. ; Yalamanchili, Sudhakar ; Schimmel, David E.
         
        
            Author_Institution : 
Dept. of Electr. Eng., Alabama Univ., Tuscaloosa, AL, USA
         
        
        
        
        
            fDate : 
6/1/1996
         
        
        
        
            Abstract : 
This paper focuses on designing high-performance pipelined networks that can operate in the presence of dynamic component failures. A general, rigorous framework for deadlock-free communication in faulty, pipelined networks is developed. A mechanism is also proposed for recovering from dynamic link and node failures. The recovery mechanism (1) is fully distributed, (2) does not require timeouts, (3) prevents fault-induced deadlock, and (4) is integrated into the virtual channel flow control mechanisms. This recovery mechanism is used to develop a new pipelined communication mechanism: acknowledged pipelined circuit-switching (APCS). APCS supports existing routing protocols that can tolerate a maximal number of static link failures, i.e., one less than the number of ports on a node. An implementation of a novel router architecture is described, and the results of detailed flit-level simulations are presented. Finally, the proposed recovery mechanism is shown to be applicable to existing adaptive wormhole routing protocols, which are prone to deadlock in the presence of dynamic faults.
         
        
            Keywords : 
fault tolerant computing; multiprocessor interconnection networks; network routing; protocols; acknowledged pipelined circuit-switching; adaptive wormhole routing protocols; deadlock-free communication; deadlock-free routing; direct interconnection networks; distributed routing; dynamic component failures; dynamic link; fault-induced deadlock; high performance pipelined networks; node failures; pipelined communication mechanism; recovery mechanism; router architecture; routing protocols; static link failures; virtual channel flow control mechanisms; Circuit faults; Fault tolerance; Intelligent networks; Laboratories; Personal communication networks; Senior members; Student members; System recovery
         
        
        
            Journal_Title : 
Computers, IEEE Transactions on