Author_Institution :
Dept. of Comput. Sci. & Eng., Florida Atlantic Univ., Boca Raton, FL, USA
Abstract :
The rectangular faulty block model is the most commonly used fault model for designing fault-tolerant, and deadlock-free routing algorithms in mesh-connected multicomputers. The convexity of a rectangle facilitates simple, efficient ways to route messages around fault regions using relatively few or no virtual channels to avoid deadlock. However, such a faulty block may include many nonfaulty nodes which are disabled, i.e., they are not involved in the routing process. Therefore, it is important to define a fault region that is convex, and at the same time, to include a minimum number of nonfaulty nodes. In this paper, we propose an optimal solution that can quickly construct a set of minimum faulty polygons, called orthogonal convex polygons, from a given set of faulty blocks in a 2-D mesh (or 2-D torus). The formation of orthogonal convex polygons is implemented using either a centralized, or distributed solution. Both solutions are based on the formation of faulty components, each of which consists of adjacent faulty nodes only, followed by the addition of a minimum number of nonfaulty nodes to make each component a convex polygon. Extensive simulation has been done to determine the number of nonfaulty nodes included in the polygon, and the result obtained is compared with the best existing known result. Results show that the proposed approach can not only find a set of minimum faulty polygons, but also does so quickly in terms of the number of rounds in the distributed solution.
Keywords :
fault tolerant computing; network routing; deadlock-free routing algorithm; fault-tolerant routing; mesh-connected multicomputer; orthogonal convex polygon; virtual channel; Algorithm design and analysis; Buildings; Computer science; Costs; Fault tolerance; Hypercubes; Mesh networks; Routing; System recovery; Tin; 2-dimensional meshes (tori); Fault model; fault tolerance; orthogonal convex polygons; routing;