Regional Information Center for Science and Technology (RICeST) - A fault-tolerant distributed subcube management scheme for hypercube multicomputer systems

DocumentCode :

806829

Title :

A fault-tolerant distributed subcube management scheme for hypercube multicomputer systems

Author :

Chen, Yi-long ; Liu, Jyh-Charn

Author_Institution :

Dept. of Comput. Sci., Texas A&M Univ., College Station, TX, USA

Volume :

Issue :

fYear :

1995

fDate :

7/1/1995

Firstpage :

766

Lastpage :

772

Abstract :

This paper proposes a fault-tolerant distributed subcube management scheme for hypercube multicomputer systems. Gracefully degradable subcube management is supported by a data structure, called the distributed subcube table (DST), and a fault-tolerant broadcast protocol, called the reliably synchronized broadcast (RSB). In an n-dimensional hypercube, DST is the collection of 2^n local subcube tables (LSTs), DST = {LST_0, LST_1, ..., LST_{2^n-1}}, where LST_x is a bit-mapped table assigned to N_x, a fault-free node whose address is x. For all x, LST_x is n+1 bits long, and it records the status (free/busy) of certain subcubes adjacent to N_x. The RSB diagnoses and avoids faults during interprocessor communication to prevent faulty nodes from being allocated for job execution. In addition to possessing a fault-tolerant design, our scheme can also achieve comparable or better performance than existing centralized schemes, as verified by extensive simulation.

Keywords :

data structures; distributed databases; hypercube networks; performance evaluation; software fault tolerance; bit-mapped table; data structure; distributed subcube table; fault-tolerant broadcast protocol; fault-tolerant distributed subcube management scheme; gracefully degradable subcube management; hypercube multicomputer systems; interprocessor communication; reliably synchronized broadcast; Broadcasting; Centralized control; Computational modeling; Data structures; Degradation; Fault tolerance; Fault tolerant systems; Hypercubes; Time factors; Tree graphs;

fLanguage :

English

Journal_Title :

Parallel and Distributed Systems, IEEE Transactions on

Publisher :

IEEE

ISSN :

1045-9219

Type :

jour

DOI :

10.1109/71.395406

Filename :

395406

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=806829