Title :
Critical-path-based message logging for incremental replay of message-passing programs
Author :
Netzer, Robert H B ; Subramanian, Sairam ; Xu, Jian
Author_Institution :
Dept. of Comput. Sci., Brown Univ., Providence, RI, USA
Abstract :
Debugging long-running, nondeterministic message-passing parallel programs requires incremental replay, the ability to exactly replay selected parts of an execution. To support incremental replay, we must log enough messages and checkpoint processes often enough to allow any requested replay to complete quickly. We present an adaptive tracing strategy to keep the message-logging overhead down. We let the user specify a bound on the maximum time any replay request is allowed to take. Our algorithm tracks what each process's critical path will be during a replay and logs enough messages to ensure the critical path will never exceed the bound. Overhead is kept low by not logging messages that can be recomputed during a replay. Experiments indicate that we log about 0.1-5% of the messages while still providing a reasonable bound on any replay.
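The logging rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the class `Process`, its methods, the per-event `cost` values, and the piggybacked path length are all hypothetical names and simplifications introduced here. The sketch assumes each process keeps an estimate of its replay critical path since its last checkpoint, that senders attach their current estimate to each message, and that checkpoints are taken often enough for local computation alone to stay within the bound.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Toy model of one process (hypothetical, for illustration only)."""
    name: str
    bound: float          # user-specified maximum replay time
    path: float = 0.0     # estimated replay critical path since last checkpoint
    logged: list = field(default_factory=list)

    def checkpoint(self):
        # A replay can restart here, so the critical path starts over.
        self.path = 0.0

    def compute(self, cost):
        # Local work must be redone during a replay, so it extends the path.
        self.path += cost

    def send(self, payload):
        # Piggyback the sender's current path estimate on the message.
        return (self.name, payload, self.path)

    def receive(self, message, cost=1.0):
        """Return True if the message was logged, False otherwise."""
        sender, payload, sender_path = message
        # If the message is not logged, a replay must first re-run the
        # sender up to the send before the receiver can continue.
        unlogged_path = max(self.path, sender_path) + cost
        if unlogged_path > self.bound:
            # Logging cuts the dependence on the sender's path.
            self.logged.append((sender, payload))
            self.path += cost
            return True
        # Cheap enough to recompute during replay, so skip logging.
        self.path = unlogged_path
        return False


p = Process("p", bound=10.0)
q = Process("q", bound=10.0)
p.compute(8.0)
q.compute(2.0)
first = q.receive(p.send("x"), cost=1.0)    # fits within the bound
p.compute(1.0)
q.checkpoint()
q.compute(3.0)
second = q.receive(p.send("y"), cost=2.0)   # would exceed the bound
print(first, second, q.logged)
```

In this toy run the first message is left unlogged because recomputing it keeps the receiver's estimated replay path within the bound, while the second is logged because depending on the sender would push the path past the bound.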
Keywords :
critical path analysis; data recording; message passing; parallel programming; program debugging; adaptive tracing strategy; checkpoint process logging; critical path; critical-path-based message logging; incremental replay; maximum time bound; message logging; message-logging overhead; nondeterministic message-passing parallel programs; recomputable messages; replay request; Adaptive algorithm; Computer bugs; Computer science; Contracts; Costs; Debugging; Delay; Runtime; Upper bound;
Conference_Title :
Distributed Computing Systems, 1994., Proceedings of the 14th International Conference on
Conference_Location :
Poznan
Print_ISBN :
0-8186-5840-1
DOI :
10.1109/ICDCS.1994.302444