DocumentCode :
2852694
Title :
Replay-Based Synchronization of Timestamps in Event Traces of Massively Parallel Applications
Author :
Becker, Daniel ; Linford, John C. ; Rabenseifner, Rolf ; Wolf, Felix
Author_Institution :
Inst. for Adv. Simulation, Forschungszentrum Jülich, Jülich
fYear :
2008
fDate :
8-12 Sept. 2008
Firstpage :
212
Lastpage :
219
Abstract :
Event traces are helpful in understanding the performance behavior of message-passing applications since they allow in-depth analyses of communication and synchronization patterns. However, the absence of synchronized hardware clocks may render the analysis ineffective because inaccurate relative event timings can misrepresent the logical event order and lead to errors when quantifying the impact of certain behaviors. Although linear offset interpolation can restore consistency to some degree, inaccuracies and time-dependent drifts may still disarrange the original succession of events, especially during longer runs. In our earlier work, we have presented an algorithm that removes the remaining violations of the logical event order postmortem and, in addition, have outlined the initial design of a parallel version. Here, we complete the parallel design and describe its implementation within the SCALASCA trace-analysis framework. We demonstrate its suitability for large-scale applications running on more than a thousand application processes and show how the correction can improve the trace analysis of a real-world application example.
Keywords :
message passing; parallel processing; program diagnostics; software performance evaluation; synchronisation; SCALASCA trace-analysis framework; communication patterns analysis; event tracing; linear offset interpolation; logical event order postmortem; massively parallel applications; message-passing applications; performance behavior; replay-based synchronization; synchronization patterns analysis; timestamps; Algorithm design and analysis; Application software; Clocks; Computer science; Interpolation; Large-scale systems; Performance analysis; Scalability; Synchronization; Timing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel Processing - Workshops, 2008. ICPP-W '08. International Conference on
Conference_Location :
Portland, OR
ISSN :
1530-2016
Print_ISBN :
978-0-7695-3375-9
Electronic_ISBN :
1530-2016
Type :
conf
DOI :
10.1109/ICPP-W.2008.17
Filename :
4626803
Link To Document :