Title :
A Self-Optimizing Computation Partitioning Algorithm for Distributed Many-Task Computing
Author :
Yu, Huashan ; Li, Yingnan ; Wu, Xianguo ; Xiao, Jian ; Li, Xiaoming
Author_Institution :
Sch. of Comput. Sci. & Electron. Eng., Peking Univ., Beijing, China
Abstract :
Many-task computing (MTC) is a practical paradigm for developing loosely coupled, complex scientific applications. In this paradigm, computation on a large dataset is decomposed into tasks that are expected to execute in parallel on dynamically allocated computing resources. These tasks pass data via files, and each one executes an existing program on one dataset element. Task scheduling is a key issue in enabling MTC on parallel platforms such as large-scale clusters, Grids and Clouds. Current solutions mainly focus on maximizing the number of utilized parallel computing resources. This paper proposes a configurable MTC model that aims to minimize an MTC computation's turnaround time with as few resources as possible. The primary strategy is to use application-specific expertise to coalesce tasks into task-sequences, and to assign tasks at the granularity of task-sequences. Based on this model, a self-optimizing task partitioning algorithm has been devised for scheduling tasks in MTC. It separates task assignment from resource allocation, and makes a tradeoff between maximizing utilized resources, balancing workload and reducing computation-scheduling overhead. The algorithm has been implemented in Harmonia, a software platform developed by Peking University for enabling MTC on large-scale distributed platforms. Both the configurable MTC model and the self-optimizing task partitioning algorithm were evaluated with a genome alternative splicing application, and the experimental results demonstrate the model's practicability.
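The idea of assigning work at the granularity of task-sequences can be illustrated with a minimal sketch. This is not the paper's algorithm or Harmonia's API: the function names `coalesce` and `assign`, the fixed `seq_len` parameter, the per-task cost estimates, and the greedy "heaviest sequence to least-loaded worker" rule are all assumptions chosen only to show the general shape of the approach.

```python
import heapq
from typing import List, Sequence


def coalesce(costs: Sequence[float], seq_len: int) -> List[List[int]]:
    """Group consecutive task indices into task-sequences of at most seq_len tasks.

    Hypothetical stand-in for the application-specific coalescing step.
    """
    if seq_len < 1:
        raise ValueError("seq_len must be >= 1")
    return [list(range(i, min(i + seq_len, len(costs))))
            for i in range(0, len(costs), seq_len)]


def assign(costs: Sequence[float], sequences: List[List[int]],
           workers: int) -> List[List[List[int]]]:
    """Assign whole task-sequences to workers (assumed greedy heuristic).

    Sequences are taken heaviest first and each goes to the currently
    least-loaded worker, so one scheduling decision covers many tasks.
    """
    heap = [(0.0, w) for w in range(workers)]  # (accumulated load, worker id)
    plan: List[List[List[int]]] = [[] for _ in range(workers)]

    def seq_cost(seq: List[int]) -> float:
        return sum(costs[t] for t in seq)

    for seq in sorted(sequences, key=seq_cost, reverse=True):
        load, w = heapq.heappop(heap)
        plan[w].append(seq)
        heapq.heappush(heap, (load + seq_cost(seq), w))
    return plan


# Six tasks with assumed cost estimates, coalesced in pairs, two workers.
costs = [4, 1, 1, 4, 2, 2]
sequences = coalesce(costs, seq_len=2)
plan = assign(costs, sequences, workers=2)
```

In this sketch a larger `seq_len` means fewer scheduling decisions but coarser load balancing, which is one way to picture the tradeoff between scheduling overhead and workload balance that the abstract mentions.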
Keywords :
parallel processing; resource allocation; task analysis; MTC model; Peking University; distributed many task computing; large scale cluster; parallel computing resource; self optimizing computation partitioning algorithm; task scheduling; task sequence; Bioinformatics; Complexity theory; Computational modeling; Computers; Partitioning algorithms; Program processors; Resource management; parallel performance; resource utilization; workload balance
Conference_Title :
ChinaGrid Conference (ChinaGrid), 2010 Fifth Annual
Conference_Location :
Guangzhou
Print_ISBN :
978-1-4244-7543-8
Electronic_ISBN :
978-1-4244-7544-5
DOI :
10.1109/ChinaGrid.2010.51