DocumentCode :
2719967
Title :
An architecture for integrated resource management of MPI jobs
Author :
Sistare, Steve ; Test, Jack ; Plauger, Dave
Author_Institution :
Sun Microsystems Inc., Mountain View, CA, USA
fYear :
2002
fDate :
2002
Firstpage :
370
Lastpage :
377
Abstract :
We present a new architecture for the integration of distributed resource management systems and parallel run-time environments such as MPI. The architecture solves the long-standing problem of achieving a tight integration between the two in a clean and robust manner that fully enables the functionality of both systems, including resource limit enforcement and accounting. We also present a more uniform command interface to the user, which simplifies the task of running parallel jobs and tools under a resource manager. The architecture is extensible and allows new systems to be incorporated. We describe the properties that a resource management system must have to work in this architecture, and find that these are ubiquitous in the resource management world. Using the Sun™ Cluster Runtime Environment, we show the generality of the approach by implementing tight integrations with PBS, LSF, and Sun Grid Engine software, and we demonstrate the advantages of a tight integration. No modifications or enhancements to these resource management systems were required, in marked contrast to ad hoc approaches, which typically require such changes.
Keywords :
application program interfaces; message passing; resource allocation; workstation clusters; LSF; MPI jobs; PBS; Sun Cluster Runtime Environment; Sun Grid Engine software; command interface; distributed resource management systems; integrated resource management; parallel run-time environments; resource limit enforcement; Computer architecture; Databases; Libraries; Lifting equipment; Peer to peer computing; Resource management; Robustness; Sockets; Sun; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cluster Computing, 2002. Proceedings. 2002 IEEE International Conference on
Print_ISBN :
0-7695-2066-9
Type :
conf
DOI :
10.1109/CLUSTR.2002.1137769
Filename :
1137769