Title :
Reliable distributed sorting through the application-oriented fault tolerance paradigm
Author :
McMillin, Bruce M. ; Ni, Lionel M.
Author_Institution :
Dept. of Comput. Sci., Missouri Univ., Rolla, MO, USA
Abstract :
The design and implementation of a reliable version of the distributed bitonic sorting algorithm using the application-oriented fault tolerance paradigm on a commercial multicomputer is described. Sorting assertions in general are discussed and the bitonic sort algorithm is introduced. Faulty behavior is discussed and a fault-tolerant parallel bitonic sort developed using this paradigm is presented. The error coverage and the response of the fault-tolerant algorithm to faulty behavior are presented. Both asymptotic complexity and the results of run-time experimental measurements on an Ncube multicomputer are given. The authors demonstrate that the application-oriented fault tolerance paradigm is applicable to problems of a noniterative nature.
Keywords :
distributed processing; fault tolerant computing; multiprocessing systems; sorting; Ncube multicomputer; application-oriented fault tolerance paradigm; asymptotic complexity; commercial multicomputer; design; error coverage; faulty behaviour; implementation; reliable distributed sorting; Algorithm design and analysis; Application software; Computer science; Fault detection; Fault tolerance; Hardware; Peer to peer computing; Software algorithms; Sorting; Testing
Conference_Title :
9th International Conference on Distributed Computing Systems, 1989
Conference_Location :
Newport Beach, CA
Print_ISBN :
0-8186-1953-8
DOI :
10.1109/ICDCS.1989.37983