Title :
Extensible resource management for cluster computing
Author :
Islam, Nayeem ; Prodromidis, A.L. ; Squillante, Mark S. ; Fong, Liana L. ; Gopal, Ajei S.
Author_Institution :
Res. Div., IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
Advanced general-purpose parallel systems should be able to support diverse applications with different resource requirements without compromising effectiveness and efficiency. We present a resource management model for cluster computing that allows multiple scheduling policies to coexist dynamically. In particular, we have built Octopus, an extensible and distributed hierarchical scheduler that implements new space-sharing, gang-scheduling, and load-sharing strategies. A series of experiments performed on an IBM SP2 suggests that Octopus can effectively match application requirements to available resources and improve the performance of a variety of parallel applications within a cluster.
Keywords :
parallel algorithms; parallel machines; processor scheduling; resource allocation; IBM SP2; Octopus; advanced general purpose parallel systems; application requirements; cluster computing; distributed hierarchical scheduler; diverse applications; extensible resource management; gang scheduling; load sharing strategies; multiple scheduling policies; parallel applications; resource management model; resource requirements; space sharing; Application software; Computer science; Concurrent computing; Delay; Dynamic scheduling; Performance evaluation; Processor scheduling; Resource management; Time sharing computer systems; Yarn;
Conference_Title :
Distributed Computing Systems, 1997., Proceedings of the 17th International Conference on
Conference_Location :
Baltimore, MD
Print_ISBN :
0-8186-7813-5
DOI :
10.1109/ICDCS.1997.603418