Title :
Coordinated Cooperative Work Using Undependable Processors with Unreliable Broadcast
Author :
Davtyan, Seda ; De Prisco, Roberto ; Georgiou, Chryssis ; Shvartsman, Alexander A.
Author_Institution :
Univ. of Connecticut, Storrs, CT, USA
Abstract :
With the end of Moore's Law in sight, parallelism has become the main means of speeding up computationally intensive applications, especially where large collections of tasks need to be performed. Network supercomputing, which takes advantage of very large numbers of computers in a distributed environment, is an effective approach to massive parallelism that harnesses the processing power inherent in large networked settings. In such settings, processor failures are no longer the exception but the norm. Any algorithm designed for realistic settings must be able to deal with failures. This paper presents a new message-passing algorithm for distributed cooperative work in synchronous settings where processors may crash, and where any broadcasts performed by crashing processors are unreliable. We specify the algorithm, prove that it is correct, and perform extensive simulations that show that its performance is close to that of similar algorithms that use reliable broadcast, and that its work compares favorably to the relevant lower bounds.
Keywords :
distributed algorithms; message passing; coordinated cooperative work; distributed cooperative work; message-passing algorithm; network supercomputing; undependable processors; Algorithm design and analysis; Complexity theory; Computer crashes; Heuristic algorithms; Partitioning algorithms; Program processors; Reliability; fault-tolerance; processor crashes; task computing; unreliable broadcast
Conference_Title :
Parallel, Distributed and Network-Based Processing (PDP), 2014 22nd Euromicro International Conference on
Conference_Location :
Torino
DOI :
10.1109/PDP.2014.11