Title :
Lightweight cooperative logging for fault replication in concurrent programs
Author :
Machado, Nuno ; Romano, Paolo ; Rodrigues, Luís
Author_Institution :
Inst. Super. Tecnico, Univ. Tec. de Lisboa, Lisbon, Portugal
Abstract :
This paper presents CoopREP, a system that supports fault replication in concurrent programs through cooperative recording and partial log combination. CoopREP employs partial recording to reduce the amount of information that a given program instance must store to support deterministic replay. This substantially reduces the overhead imposed by code instrumentation, but raises the problem of finding a combination of logs capable of replaying the fault. CoopREP tackles this issue by introducing several innovative statistical analysis techniques that guide the search for partial logs to be combined and used during the replay phase. CoopREP has been evaluated using both standard benchmarks for multi-threaded applications and a real-world application. The results show that CoopREP can successfully replay concurrency bugs involving tens of thousands of memory accesses, reducing logging overhead with respect to state-of-the-art non-cooperative logging schemes by up to 50 times in computationally intensive applications.
Keywords :
concurrency control; fault diagnosis; multi-threading; program debugging; statistical analysis; system monitoring; CoopREP; code instrumentation; concurrency bugs; concurrent programs; cooperative logging; cooperative recording; deterministic replay; fault replication; logging overhead reduction; memory access; multithreaded applications; partial log combination; partial recording; real-world application; statistical analysis techniques; Computer bugs; Concurrent computing; Instruction sets; Instruments; Measurement; Production; Vectors; concurrency errors; debugging; deterministic replay; performance;
Conference_Title :
Dependable Systems and Networks (DSN), 2012 42nd Annual IEEE/IFIP International Conference on
Conference_Location :
Boston, MA
Print_ISBN :
978-1-4673-1624-8
Electronic_ISBN :
1530-0889
DOI :
10.1109/DSN.2012.6263953