DocumentCode :
2958332
Title :
SyncChecker: Detecting Synchronization Errors between MPI Applications and Libraries
Author :
Chen, Zhezhe ; Li, Xinyu ; Chen, Jau-Yuan ; Zhong, Hua ; Qin, Feng
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
fYear :
2012
fDate :
21-25 May 2012
Firstpage :
342
Lastpage :
353
Abstract :
Nonblocking communication improves performance, but it is prone to synchronization errors between MPI applications and the underlying MPI libraries. Such an error occurs as follows: after initiating nonblocking communication and performing overlapped computation, the MPI application reuses the message buffer before the MPI library has finished using it, which may lead to sending corrupted message data or reading undefined message data. This paper presents a new method called SyncChecker to detect synchronization errors in MPI nonblocking communication. To examine whether the use of message buffers is well synchronized between the MPI application and the MPI library, SyncChecker first tracks relevant memory accesses in the MPI application and the corresponding message send/receive operations in the MPI library. It then checks whether the correct execution order between the MPI application and the MPI library is enforced by the MPI completion check routines. If not, SyncChecker reports the error with diagnostic information. To reduce runtime overhead, we propose three dynamic optimizations. We have implemented a prototype of SyncChecker on Linux and evaluated it with seven bug cases (five introduced by the original developers and two injected) in four different MPI applications. Our experiments show that SyncChecker detects all the evaluated synchronization errors and provides helpful diagnostic information. Moreover, our experiments with seven NAS Parallel Benchmarks demonstrate that SyncChecker incurs moderate runtime overhead, 1.3-9.5 times with an average of 5.2 times, making it suitable for software testing.
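Illustrative sketch (not from the paper): the error pattern described in the abstract can be modeled with a small toy checker. The class `ToyChecker`, its methods, and the event names are hypothetical and chosen only for illustration. They do not reflect SyncChecker's actual design or any MPI API. The model simply flags an application access to a buffer that falls between the start of a nonblocking operation and its completion check.

```python
from dataclasses import dataclass, field


@dataclass
class ToyChecker:
    """Hypothetical toy model of the check; not SyncChecker itself."""
    # request id -> (buffer name, "send" or "recv") for operations in flight
    pending: dict = field(default_factory=dict)
    # recorded violations as (request id, buffer, kind, access type)
    errors: list = field(default_factory=list)

    def start(self, req, buf, kind):
        # Models initiating a nonblocking send or receive on `buf`.
        self.pending[req] = (buf, kind)

    def complete(self, req):
        # Models a completion check routine returning for `req`.
        self.pending.pop(req, None)

    def access(self, buf, op):
        # Models an application "read" or "write" of `buf`.
        for req, (owned, kind) in self.pending.items():
            if owned != buf:
                continue
            # In this toy model, a write is unsafe while any operation is
            # in flight (it may corrupt outgoing or incoming data); a read
            # is unsafe only while a receive may still be filling the buffer.
            if op == "write" or kind == "recv":
                self.errors.append((req, buf, kind, op))


checker = ToyChecker()
checker.start("r1", "buf", "send")
checker.access("buf", "write")   # flagged: buffer reused before completion
checker.complete("r1")
checker.access("buf", "write")   # not flagged: ordered after completion
print(checker.errors)
```

Only the first write is recorded, because it is the only access not ordered after the completion check. A real tool would additionally need to track address ranges rather than buffer names.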
Keywords :
Linux; application program interfaces; message passing; program verification; software libraries; MPI applications; MPI libraries; MPI nonblocking communication; SyncChecker; message buffer; synchronization errors detection; Computer bugs; Instruments; Libraries; Prototypes; Runtime; Semantics; Synchronization
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International
Conference_Location :
Shanghai
ISSN :
1530-2075
Print_ISBN :
978-1-4673-0975-2
Type :
conf
DOI :
10.1109/IPDPS.2012.40
Filename :
6267848