Title :
Reliable distributed sorting through the application-oriented fault tolerance paradigm
Author :
McMillin, Bruce M. ; Ni, Lionel M.
Author_Institution :
Dept. of Comput. Sci., Missouri-Rolla Univ., MO, USA
fDate :
7/1/1992 12:00:00 AM
Abstract :
A fault-tolerant parallel sorting algorithm developed using the application-oriented fault tolerance paradigm is presented. The algorithm is tolerant of one processor/link failure in an n-cube. The addition of reliability to the sorting algorithm results in a performance penalty. Asymptotically, the fault-tolerant algorithm is less costly than host sorting. Experimentally it is shown that fault-tolerant sorting quickly becomes more efficient that host sorting when the bitonic sort/merge is considered. The main contribution is the demonstration that the application-oriented fault tolerance paradigm is applicable to problems of a noniterative-convergent nature
Keywords :
fault tolerant computing; parallel algorithms; parallel programming; programming theory; sorting; application-oriented fault tolerance; bitonic sort/merge; fault-tolerant parallel sorting algorithm; host sorting; n-cube; noniterative convergence; performance penalty; processor/link failure; reliable distributed sorting; Algorithm design and analysis; Application software; Fault detection; Fault tolerance; Hardware; Parallel algorithms; Peer to peer computing; Performance evaluation; Sorting; Testing;
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on