Title :
Scalable Distributed Consensus to Support MPI Fault Tolerance
Author :
Buntinas, Darius
Author_Institution :
Argonne Nat. Lab., Argonne, IL, USA
Abstract :
As system sizes increase, the amount of time in which an application can run without experiencing a failure decreases. Exascale applications will need to address fault tolerance. In order to support algorithm-based fault tolerance, communication libraries will need to provide fault-tolerance features to the application. One important fault-tolerance operation is distributed consensus. This is used, for example, to collectively decide on a set of failed processes. This paper describes a scalable, distributed consensus algorithm that is used to support new MPI fault-tolerance features proposed by the MPI 3 Forum's fault-tolerance working group. The algorithm was implemented and evaluated on a 4,096-core Blue Gene/P. The implementation was able to perform a full-scale distributed consensus in 222 μs and scaled logarithmically.
Keywords :
application program interfaces; fault tolerant computing; message passing; 4096-core Blue Gene-P; MPI3 Forum fault-tolerance working group; algorithm-based fault tolerance; communication libraries; exascale applications; scalable distributed consensus algorithm; Checkpointing; Detectors; Fault tolerance; Fault tolerant systems; Libraries; Proposals; Semantics;
Conference_Title :
Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International
Conference_Location :
Shanghai
Print_ISBN :
978-1-4673-0975-2
DOI :
10.1109/IPDPS.2012.113