Title :
Design and analysis of software reconfiguration strategies for hypercube multicomputers under multiple faults
Author :
Peercy, M. ; Banerjee, P.
Author_Institution :
Center for Reliable & High-Performance Comput., Illinois Univ., Urbana, IL, USA
Abstract :
The authors discuss the design of a software reconfiguration strategy for hypercube multicomputer architectures under multiple faults. The advantage of the strategy over previous schemes is that it requires no redundant hardware, but supports reconfiguration through graceful degradation. It is based on the notion of using multiple virtual processors on a single physical processor and using these virtual processors for work-load redistribution under faults. The authors describe an environment, developed on a commercially available Intel iPSC/2 hypercube multicomputer, for implementing the software-based fault tolerance scheme. Results of experiments performed with this environment on the performance degradation of application programs under hardware faults are described. The reconfiguration scheme shows low overhead at low cost, and even provides improved efficiency on a fault-free hypercube.
Keywords :
fault tolerant computing; hypercube networks; multiprocessing programs; reconfigurable architectures; software reliability; virtual machines; Intel iPSC/2 hypercube multicomputer; graceful degradation; hardware faults; hypercube multicomputers; multiple faults; multiple virtual processors; software reconfiguration strategies; software-based fault tolerance scheme; Circuit faults; Computer architecture; Degradation; Fault diagnosis; Hardware; Hypercubes; Joining processes; Object oriented programming; Peer to peer computing; Software design;
Conference_Title :
Fault-Tolerant Computing, 1992. FTCS-22. Digest of Papers., Twenty-Second International Symposium on
Conference_Location :
Boston, MA, USA
Print_ISBN :
0-8186-2875-8
DOI :
10.1109/FTCS.1992.243590