Title :
Scalable temporal order analysis for large scale debugging
Author :
Ahn, Dong H. ; de Supinski, Bronis R. ; Laguna, Ignacio ; Lee, Gregory L. ; Liblit, Ben ; Miller, Barton P. ; Schulz, Markus
Author_Institution :
Computation Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA
Abstract :
We present a scalable temporal order analysis technique that supports debugging of large-scale applications by classifying MPI tasks based on their logical program execution order. Our approach combines static analysis techniques with dynamic analysis to determine this temporal order scalably. It uses scalable stack trace analysis techniques to guide the selection of critical program execution points in anomalous application runs. Our novel temporal ordering engine then leverages this information, along with the application's static control structure, to apply data flow analysis techniques that identify key application data such as loop control variables. We then use lightweight techniques to gather the dynamic data that determines the temporal order of the MPI tasks. Our evaluation, which extends the Stack Trace Analysis Tool (STAT), demonstrates that this temporal order analysis technique can isolate bugs in benchmark codes with injected faults as well as a real-world hang case with AMG2006.
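As a rough illustration of the classification idea described in the abstract, the following Python sketch groups tasks by a progress key and orders the groups from least to most advanced. It is not the authors' implementation: the function name `classify_by_progress`, the sample data, and the key layout (outer loop iteration, inner loop iteration, statement index) are all hypothetical, and the sketch assumes every task is sampled inside the same loop nest so that the keys can be compared lexicographically.

```python
from collections import defaultdict


def classify_by_progress(samples):
    """Group task ranks by progress key, ordered from least to most advanced.

    `samples` maps an MPI rank to a tuple of loop control variable values
    followed by a statement index (a hypothetical encoding). Python compares
    tuples lexicographically, which matches logical execution order only
    under the assumption that all tasks are in the same loop nest.
    """
    groups = defaultdict(list)
    for rank, key in samples.items():
        groups[tuple(key)].append(rank)
    return [(key, sorted(ranks)) for key, ranks in sorted(groups.items())]


# Hypothetical samples: rank -> (outer iteration, inner iteration, statement index)
samples = {0: (5, 2, 1), 1: (5, 2, 1), 2: (4, 7, 3), 3: (5, 2, 1)}
ordered = classify_by_progress(samples)
least_advanced_key, least_advanced_ranks = ordered[0]
```

In this toy input, rank 2 is still in outer iteration 4 while the other ranks have reached iteration 5, so it forms the least-advanced group and would be the first candidate to inspect in a hang.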
Keywords :
application program interfaces; message passing; program control structures; program debugging; program diagnostics; AMG2006; MPI task classification; MPI tasks; STAT; anomalous application runs; benchmark codes; critical program execution points; data flow analysis techniques; dynamic analysis; dynamic data; injected faults; large scale debugging; lightweight techniques; logical program execution order; loop control variables; scalable stack trace analysis techniques; scalable temporal order analysis technique; stack trace analysis tool; static analysis techniques; static control structure; temporal order scalably; temporal ordering engine;
Conference_Title :
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Conference_Location :
Portland, OR
DOI :
10.1145/1654059.1654104