DocumentCode :
2787236
Title :
Stack Trace Analysis for Large Scale Debugging
Author :
Arnold, Dorian C. ; Ahn, Dong H. ; De Supinski, Bronis R. ; Lee, Gregory L. ; Miller, Barton P. ; Schulz, Martin
Author_Institution :
Dept. of Comput. Sci., Wisconsin Univ., Madison, WI
fYear :
2007
fDate :
26-30 March 2007
Firstpage :
1
Lastpage :
10
Abstract :
We present the Stack Trace Analysis Tool (STAT) to aid in debugging extreme-scale applications. STAT can reduce problem exploration spaces from thousands of processes to a few by sampling stack traces to form process equivalence classes, groups of processes exhibiting similar behavior. We can then use full-featured debuggers on representatives from these behavior classes for root cause analysis. STAT scalably collects stack traces over a sampling period to assemble a profile of the application's behavior. STAT routines process the samples to form a call graph prefix tree that encodes common behavior classes over the program's process space and time. STAT leverages MRNet, an infrastructure for tool control and data analyses, to overcome scalability barriers faced by heavy-weight debuggers. We present STAT's design and an evaluation that shows STAT gathers informative process traces from thousands of processes with sub-second latencies, a significant improvement over existing tools. Our case studies of production codes verify that STAT supports the quick identification of errors that were previously difficult to locate.
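The idea of merging stack traces into a call graph prefix tree and reading process equivalence classes off it can be illustrated with a minimal Python sketch. This is not STAT's implementation: it assumes a single stack sample per process, runs in one address space with no MRNet-style tree reduction, and the names `TraceNode`, `merge_traces`, `equivalence_classes`, and the sample function names are hypothetical.

```python
class TraceNode:
    """One call-graph prefix-tree node: a frame plus the ranks that reached it."""

    def __init__(self, frame):
        self.frame = frame
        self.ranks = set()      # process ranks whose trace passes through this frame
        self.children = {}      # callee frame name -> TraceNode


def merge_traces(traces):
    """Merge per-rank stack traces (outermost frame first) into a prefix tree.

    `traces` maps a process rank to a list of frame names. Assumption: one
    sample per rank; a real tool would also merge samples over time.
    """
    root = TraceNode("<root>")
    for rank, frames in traces.items():
        node = root
        node.ranks.add(rank)
        for frame in frames:
            if frame not in node.children:
                node.children[frame] = TraceNode(frame)
            node = node.children[frame]
            node.ranks.add(rank)
    return root


def equivalence_classes(root):
    """Group ranks by the full call path at which their trace ends."""
    classes = {}
    stack = [(root, ())]
    while stack:
        node, path = stack.pop()
        in_children = set()
        for child in node.children.values():
            in_children |= child.ranks
            stack.append((child, path + (child.frame,)))
        # Ranks present here but in no child stopped at this frame.
        ending_here = node.ranks - in_children
        if ending_here:
            classes[path] = sorted(ending_here)
    return classes


# Hypothetical sample: four ranks, one of which is stuck in a different call.
sample = {
    0: ["main", "solve", "MPI_Barrier"],
    1: ["main", "solve", "MPI_Barrier"],
    2: ["main", "solve", "MPI_Waitall"],
    3: ["main", "solve", "MPI_Barrier"],
}
tree = merge_traces(sample)
classes = equivalence_classes(tree)
# One representative rank per class is a candidate for a full-featured debugger.
representatives = {path: ranks[0] for path, ranks in classes.items()}
```

In this sketch the four ranks collapse into two classes, so only two representatives would need to be inspected by hand.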
Keywords :
parallel programming; program debugging; program diagnostics; software libraries; software tools; trees (mathematics); STAT routines; Stack Trace Analysis Tool; call graph prefix tree; large scale debugging; parallel application; root cause analysis; Assembly; Data analysis; Debugging; Delay; Large-scale systems; Production; Sampling methods; Scalability; Space exploration; Tree graphs;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE International
Conference_Location :
Long Beach, CA
Print_ISBN :
1-4244-0910-1
Electronic_ISBN :
1-4244-0910-1
Type :
conf
DOI :
10.1109/IPDPS.2007.370254
Filename :
4227982