Title :
A fault tolerant distributed parallel processing system on LAN workstations
Author :
Haghighat, Abolfazl Toroghi ; Faez, Karim
Author_Institution :
Dept. of Electr. Eng., Amirkabir Univ. of Technol., Tehran, Iran
Abstract :
The workstations of a computer network can be used to implement a parallel processing system. This approach reuses the existing network equipment, so parallel processing incurs no extra hardware cost. The aim of this paper is the development of a fault tolerant parallel processing system on an Ethernet LAN with IBM PC workstations. One application of this system is medical image processing in a Picture Archiving and Communication System (PACS) for use in hospitals. For example, when a group of ultrasound images is produced in a medical examination, a noise rejection filter program must be applied to all of the images. In this system, the client holds a number of input files and executable files. Each server workstation loads one of these input files and its associated executable, executes the loaded task, and prepares the output file for the client. The system must be fault tolerant: if, for example, a server crashes, is turned off by its user, or enters an infinite loop, the client must detect the fault and prevent system failure. If a software or hardware error is detected, the uncompleted task is placed in a wait state to be loaded by another server. The number of clients and servers is arbitrary and may vary during the operation of the system. We compared two fault tolerance approaches that can be used to design such a system: a Watchdog timer and periodic signal transmission. The Watchdog timer was selected, in conjunction with a fault counting approach, for use in the parallel processing system. A shared data structure file on the file server's disk is used for communication between clients and servers. Each client uses this file to manage its tasks and to monitor the progress of the operation. Finally, the experimental results of the system are presented.
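The recovery scheme described in the abstract (a Watchdog timer per loaded task, a fault count, and a wait state for uncompleted tasks) can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Task`, `TaskTable`, `claim`, `complete`, and `check_watchdogs`, the `max_faults` limit, and the extra `failed` state are hypothetical. The abstract does not say how the fault count is used, so this sketch assumes a task is abandoned after a fixed number of faults. The table is kept in memory here, whereas the paper stores its shared data structure in a file on the file server's disk.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Task states; "waiting" mirrors the wait state named in the abstract.
# "failed" is an assumption of this sketch.
WAITING, RUNNING, DONE, FAILED = "waiting", "running", "done", "failed"


@dataclass
class Task:
    input_file: str
    executable: str
    state: str = WAITING
    server: Optional[str] = None  # server currently running the task
    deadline: float = 0.0         # watchdog expiry time
    faults: int = 0               # number of watchdog expirations so far


class TaskTable:
    """In-memory stand-in for the shared data structure file (hypothetical)."""

    def __init__(self, timeout: float, max_faults: int) -> None:
        self.timeout = timeout
        self.max_faults = max_faults
        self.tasks: Dict[str, Task] = {}

    def add(self, name: str, input_file: str, executable: str) -> None:
        self.tasks[name] = Task(input_file, executable)

    def claim(self, server: str, now: float) -> Optional[str]:
        """A server takes the first waiting task and starts its watchdog."""
        for name, task in self.tasks.items():
            if task.state == WAITING:
                task.state = RUNNING
                task.server = server
                task.deadline = now + self.timeout
                return name
        return None

    def complete(self, name: str, server: str) -> bool:
        """Accept a result only from the server that currently owns the task."""
        task = self.tasks[name]
        if task.state == RUNNING and task.server == server:
            task.state = DONE
            return True
        return False

    def check_watchdogs(self, now: float) -> List[str]:
        """Client side: return expired tasks to the wait state and count faults."""
        expired = []
        for name, task in self.tasks.items():
            if task.state == RUNNING and now > task.deadline:
                task.faults += 1
                task.server = None
                # Assumption: give up once the fault count reaches the limit.
                task.state = FAILED if task.faults >= self.max_faults else WAITING
                expired.append(name)
        return expired
```

Time is passed in explicitly so the behavior is easy to follow. A real client would more likely call `check_watchdogs(time.monotonic())` periodically. A server that misses its deadline loses the task, and a late result from it is rejected because it no longer owns the task.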
Keywords :
PACS; biomedical engineering; fault tolerant computing; file servers; local area networks; medical image processing; parallel processing; workstations; Ethernet; IBM PC workstations; LAN workstations; Picture Archiving and Communication System; Watchdog timer; computer network; experimental results; fault counting approach; fault tolerant distributed parallel processing; file management; file server disk; hospitals; medical examination; network equipment; noise rejection filter program; parallel processing system; periodic signal transmission; shared data structure file; ultrasound images; Computer networks; Costs; Ethernet networks; Fault tolerant systems; File servers; Hardware; Local area networks; Parallel processing; Picture archiving and communication systems; Workstations
Conference_Title :
Information, Communications and Signal Processing, 1997. ICICS., Proceedings of 1997 International Conference on
Print_ISBN :
0-7803-3676-3
DOI :
10.1109/ICICS.1997.652228