Title :
Optimal Placement of Application-Level Checkpoints
Author :
Wang, Panfeng ; Wang, Zhiyuan ; Du, Yunfei ; Yang, Xuejun ; Zhou, Haifang
Author_Institution :
Nat. Lab. for Parallel & Distrib. Process., Nat. Univ. of Defense Technol., Changsha
Abstract :
One of the basic problems in efficient application-level checkpointing is the placement of checkpoints in the source code. In this paper we discuss two common questions that arise with the source-to-source precompiler ALEC: 1) if there are N checkpoints in the application's source code, how can M of them be picked so as to minimize the total amount of checkpoint data? 2) if there are no checkpoints in the application's source code, how can a set of checkpoints be inserted so as to minimize the amount of checkpoint data? We show that both questions can be abstracted as a mathematical model similar to the 0-1 integer programming model, and that this model can be solved using the implicit enumeration method. The solution methods proposed in the paper have been implemented and integrated into ALEC. Experimental results show that the method is efficient.
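As an illustration of question 1, the following is a minimal sketch of implicit enumeration over a 0-1 selection problem. It is not the model from the paper: the abstract does not give the objective or constraints, so the sketch assumes the simplest possible form, namely choosing exactly M of N candidate checkpoints so that the sum of their (assumed known) data sizes is minimal. The function name `pick_checkpoints` and the `sizes` input are hypothetical. With only this cardinality constraint the answer could also be found by sorting; the point is to show the enumeration-with-pruning pattern, which would remain applicable if further constraints were added.

```python
from typing import List, Optional, Tuple


def pick_checkpoints(sizes: List[int], m: int) -> Tuple[Optional[int], List[int]]:
    """Choose exactly m indices of `sizes` with minimum total size.

    Assumed model (not taken from the paper):
        minimize   sum(sizes[i] * x[i])
        subject to sum(x[i]) == m,  x[i] in {0, 1}
    Returns (best_cost, chosen_indices), or (None, []) if infeasible.
    """
    n = len(sizes)
    best_cost: Optional[int] = None
    best_pick: List[int] = []

    def lower_bound(i: int, need: int) -> int:
        # Cheapest way to still select `need` items from sizes[i:].
        return sum(sorted(sizes[i:])[:need])

    def visit(i: int, chosen: List[int], cost: int) -> None:
        nonlocal best_cost, best_pick
        need = m - len(chosen)
        if need == 0:
            # Complete assignment: all remaining x[i] are implicitly 0.
            if best_cost is None or cost < best_cost:
                best_cost, best_pick = cost, list(chosen)
            return
        if n - i < need:
            return  # Infeasible: not enough variables left.
        if best_cost is not None and cost + lower_bound(i, need) >= best_cost:
            return  # Bounded: this subtree cannot beat the incumbent.
        # Branch x[i] = 1.
        chosen.append(i)
        visit(i + 1, chosen, cost + sizes[i])
        chosen.pop()
        # Branch x[i] = 0.
        visit(i + 1, chosen, cost)

    visit(0, [], 0)
    return best_cost, best_pick


if __name__ == "__main__":
    # Hypothetical checkpoint data sizes for four candidate locations.
    print(pick_checkpoints([40, 10, 30, 20], 2))
```

The search never lists all 2^N assignments explicitly: a subtree is skipped as soon as it is infeasible or its lower bound cannot improve on the best solution found so far, which is the sense in which the enumeration is "implicit".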
Keywords :
checkpointing; integer programming; minimisation; program compilers; ALEC source-to-source precompiler; application-level checkpoint placement; checkpoint data minimization; implicit enumeration method; integer programming model; mathematic model; source code; Checkpointing; Concurrent computing; Distributed processing; Educational institutions; High performance computing; Laboratories; Large-scale systems; Mathematical model; Programming profession; Writing;
Conference_Title :
High Performance Computing and Communications, 2008. HPCC '08. 10th IEEE International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-0-7695-3352-0
DOI :
10.1109/HPCC.2008.40