Title :
Execution replay on distributed memory architectures
Author :
Leu, Eric ; Schiper, André ; Zramdini, Abdelwahab
Author_Institution :
Dept. d'Inf., Ecole Polytech. Federale de Lausanne, Switzerland
Abstract :
Debugging parallel programs on MIMD machines is a difficult task because successive executions of the same program can lead to different behaviors. To solve this problem, a method called execution replay has been introduced, which guarantees that the reexecution of a program is equivalent to the initial execution. Most execution replay techniques proposed until now can be described as 'data driven' techniques. Such techniques are relatively easy to implement for the most common communication primitives. However, the time needed to record the large amount of required information is significant and may modify the initial execution, in which case execution replay becomes meaningless. Another class, called control driven execution replay, limits the amount of recorded information. The paper presents a control driven solution that realizes execution replay on distributed memory architectures. In contrast to all other proposed approaches, the technique is adapted to nonblocking primitives and does not depend on any form of message passing communication.
Keywords :
parallel programming; program debugging; software tools; MIMD machines; control driven execution replay; distributed memory architectures; initial execution; nonblocking primitives; parallel program debugging; program reexecution; Communication system control; Debugging; Memory architecture; Message passing; Parallel machines; Prototypes
Conference_Title :
Parallel and Distributed Processing, 1990. Proceedings of the Second IEEE Symposium on
Conference_Location :
Dallas, TX
Print_ISBN :
0-8186-2087-0
DOI :
10.1109/SPDP.1990.143516