Title :
Dynamic fault reconfiguration in a mesh-connected MIMD environment
Author :
Uyar, M. Ümit ; Reeves, Anthony P.
Author_Institution :
AT&T Bell Lab., Holmdel, NJ, USA
fDate :
10/1/1988
Abstract :
The near-neighbor problem is characterized by many iterations of a parallel matrix operation in which each matrix element is recomputed as a function of itself and its immediately adjacent neighbors. Several highly parallel computer systems have been designed with the near-neighbor class of problems as the target application. As the number of processors in evolving parallel computer systems increases, the ability to tolerate processor failures becomes more important. The authors show how fault tolerance can be efficiently achieved on an MIMD (multiple-instruction, multiple-data-stream) computer system for the near-neighbor problem by task redistribution. The techniques discussed minimize the extra data transfers and/or the extra computation in a system with faulty processors and links.
Keywords :
fault tolerant computing; parallel processing; data transfers; dynamic fault reconfiguration; fault tolerance; highly parallel computer systems; mesh-connected MIMD environment; near-neighbor problem; parallel matrix operation; processor failures; Application software; Concurrent computing; Costs; Fault tolerance; Fault tolerant systems; Hardware; Image processing; Large-scale systems; Parallel processing; Partial differential equations;
Journal_Title :
Computers, IEEE Transactions on