DocumentCode :
3455623
Title :
Managing MPICH-G2 Jobs with WebCom-G
Author :
O'Dowd, Padraig J. ; Patil, Adarsh ; Morrison, John P.
Author_Institution :
Dept. of Comput. Sci., Univ. Coll., Cork
fYear :
2005
fDate :
4-6 July 2005
Firstpage :
258
Lastpage :
266
Abstract :
This paper discusses the use of WebCom-G to manage and schedule MPICH-G2 (MPI) jobs. Users submit their MPI applications to a WebCom-G portal through a Web interface. WebCom-G then selects the machines on which to execute the application, based on the machines available to it and the number of machines the user requested. WebCom-G automatically and dynamically constructs an RSL script naming the selected machines and schedules the job for execution on them. Once the MPI application has finished executing, the results are stored on the portal server, where the user can collect them. A main advantage of this system is fault survival: if any of the machines fail during the execution of a job, WebCom-G can handle the failure automatically. Following a machine failure, WebCom-G can create a new RSL script with the failed machines removed, incorporate new machines (if any are available) to replace the failed ones, and re-launch the job without any intervention from the user. The probability of failures in a grid environment is high, so fault survival is an important issue.
Keywords :
Internet; fault tolerant computing; grid computing; message passing; portals; processor scheduling; MPI; MPICH-G2 jobs; RSL script; WebCom-G; fault survival; grid portals; job scheduling; Application software; Computer science; Dynamic scheduling; Educational institutions; Grid computing; Operating systems; Portals; Processor scheduling; Software standards; Software tools; Grid Portals; MPI; MPICH-G2; Scheduling and Fault Survival; WebCom-G; Globus;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Computing, 2005. ISPDC 2005. The 4th International Symposium on
Conference_Location :
Lille
Print_ISBN :
0-7695-2434-6
Type :
conf
DOI :
10.1109/ISPDC.2005.34
Filename :
1609978