DocumentCode
2345501
Title
MPICH-G-DM: An Enhanced MPICH-G with Supporting Dynamic Job Migration
Author
Wei, Xiaohui ; Li, Hongliang ; Li, Dexiong
Author_Institution
Coll. of Comput. Sci. & Technol., Jilin Univ., Changchun, China
fYear
2009
fDate
21-22 Aug. 2009
Firstpage
67
Lastpage
76
Abstract
Grid computing is attracting increasing attention for its massive computational capacity. Tools such as the Globus Toolkit and MPICH-G2 have been developed to support scientists in their research. As a Grid-enabled implementation of MPI, MPICH-G2 helps developers port parallel applications to cross-domain environments. Because current computationally intensive parallel applications, especially long-running tasks, require a computing platform with high availability as well as high performance, dynamic job migration in Grid environments has become an essential issue. In this study, we present MPICH-G-DM, a version of MPICH-G2 that supports dynamic job migration. We use a Virtual Job Model (VJM) to reserve resources for migrating jobs in advance, improving the efficiency of the system. An Asynchronous Migration Protocol (AMP) is proposed that enables migrating sub-jobs to checkpoint/restart and update their new addresses concurrently without global synchronization. To reduce the communication overhead of job migration, MPICH-G-DM limits the number of control messages among domains to O(N). Experimental results show that MPICH-G-DM is effective and reliable.
Keywords
application program interfaces; grid computing; message passing; software tools; Globus Toolkit; MPI; MPICH-G-DM; MPICH-G2 version; asynchronous migration protocol; dynamic job migration; grid-enabled implementation; high availability; high performance computing platform; virtual job model; Availability; Concurrent computing; Distributed computing; Educational institutions; Grid computing; High performance computing; Load management; Parallel processing; Protocols; Resource management; Grid; MPICH-G2; VJM; across domain; dynamic job migration;
fLanguage
English
Publisher
IEEE
Conference_Titel
ChinaGrid Annual Conference, 2009. ChinaGrid '09. Fourth
Conference_Location
Yantai, Shandong
Print_ISBN
978-0-7695-3818-1
Type
conf
DOI
10.1109/ChinaGrid.2009.9
Filename
5328431