Title :
Marching Band: Fault-Tolerance with Replicable Message Delivery Order
Author :
Danilecki, Arkadiusz D.
Abstract :
Marching Band ensures the same total ordering of message deliveries in each possible execution history, providing replicable execution for a subset of piecewise deterministic applications. With Marching Band, any number of failures can be tolerated using sender-based logging. The main idea behind the algorithm is to log and then broadcast each sent message together with a precomputed tag that describes the delivery order of that message.
Keywords :
checkpointing; deterministic algorithms; fault tolerant computing; message passing; execution history; fault-tolerance; marching band; piecewise deterministic applications; replicable execution; replicable message delivery order; sender-based logging; Arrays; Checkpointing; Computer crashes; Fault tolerance; Fault tolerant systems; History; Protocols; determinism; message-passing
Conference_Title :
Parallel, Distributed and Network-Based Processing (PDP), 2015 23rd Euromicro International Conference on
Conference_Location :
Turku
DOI :
10.1109/PDP.2015.52