Title :
Design and Potential Performance of Goal-Oriented Job Scheduling Policies for Parallel Computer Workloads
Author :
Chiang, Su-Hui ; Vasupongayya, Sangsuree
Author_Institution :
Dept. of Comput. Sci., Portland State Univ., Portland, OR
Abstract :
To balance multiple scheduling performance requirements on parallel computer systems, traditional job schedulers use many parameters that can be configured to define job or queue priorities. Offering many parameters seems flexible, but in practice tuning their values is highly challenging. To simplify the task of resource management, we propose goal-oriented policies, which allow system administrators to specify high-level performance objectives rather than tune low-level scheduling parameters. We study the design of goal-oriented policies, including (1) appropriate multi-objective models for specifying trade-offs between objectives, (2) efficient search algorithms for finding the best schedule at each scheduling decision point, and (3) appropriate performance measures to be optimized in the objectives with respect to two common performance requirements: preventing starvation and favoring shorter jobs. We compare goal-oriented policies with widely used backfill policies. Policies are evaluated by simulation using ten monthly workloads that ran on a Linux cluster (IA-64) from NCSA. Our results show that by automatically optimizing performance according to the given objectives through search, goal-oriented policies can simultaneously outperform FCFS-backfill and LXF-backfill, which are designed to favor low maximum wait and low average slowdown, respectively.
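The two measures named in the abstract, maximum wait and average slowdown, can be illustrated with a minimal sketch. It assumes the common definitions (wait = start time minus submit time; slowdown = (wait + runtime) / runtime), which the abstract itself does not spell out. The `Job` record, the function names, and the sample values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical job record; all times are in hours.
    submit: float   # time the job was submitted
    start: float    # time the job began running
    runtime: float  # actual execution time (must be > 0)


def max_wait(jobs):
    """Largest queueing delay over all jobs (a starvation indicator)."""
    return max(j.start - j.submit for j in jobs)


def avg_slowdown(jobs):
    """Mean of (wait + runtime) / runtime; short jobs that wait long score high."""
    return sum((j.start - j.submit + j.runtime) / j.runtime for j in jobs) / len(jobs)


# Illustrative values only, not data from the NCSA workloads.
jobs = [
    Job(submit=0.0, start=0.0, runtime=4.0),
    Job(submit=1.0, start=4.0, runtime=1.0),
    Job(submit=2.0, start=5.0, runtime=0.5),
]

print(max_wait(jobs))      # 3.0
print(avg_slowdown(jobs))  # 4.0
```

In this example the shortest job waits as long as the second job but contributes a slowdown of 7, which is why a policy that favors shorter jobs tends to lower average slowdown, while bounding the maximum wait guards against starvation.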
Keywords :
mathematical programming; parallel algorithms; processor scheduling; resource allocation; search problems; goal-oriented job scheduling policy; optimisation; parallel computer system; resource management; search algorithm; workload balancing; Backfill scheduling policies; Batch processing systems; Goal-oriented policies; Multi-objective models; Parallel systems; Scheduling; Search algorithms
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
DOI :
10.1109/TPDS.2008.48