DocumentCode :
1813346
Title :
Autonomic scheduling of tasks from data parallel patterns to CPU/GPU core mixes
Author :
Serban, T. ; Danelutto, M. ; Kilpatrick, P.
Author_Institution :
Dept. Comput. Sci., Univ. of Pisa, Pisa, Italy
fYear :
2013
fDate :
1-5 July 2013
Firstpage :
72
Lastpage :
79
Abstract :
We propose a methodology for optimizing the execution of data parallel (sub-)tasks on CPU and GPU cores of the same heterogeneous architecture. The methodology is based on two main components: i) an analytical performance model for scheduling tasks among CPU and GPU cores, such that the global execution time of the overall data parallel pattern is optimized; and ii) an autonomic module which uses the analytical performance model to implement the data parallel computations in a completely autonomic way, requiring no programmer intervention to optimize the computation across CPU and GPU cores. The analytical performance model uses a small set of simple parameters to devise a partitioning, between CPU and GPU cores, of the tasks derived from structured data parallel patterns/algorithmic skeletons. The model takes into account both hardware-related and application-dependent parameters. It computes the percentage of tasks to be executed on CPU and GPU cores such that both kinds of cores are exploited and performance figures are optimized. The autonomic module, implemented in FastFlow, executes a generic map (reduce) data parallel pattern, scheduling part of the tasks to the GPU and part to the CPU cores so as to achieve optimal execution time. Experimental results on state-of-the-art CPU/GPU architectures assess both the properties of the performance model and the effectiveness of the autonomic module.
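The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea of splitting tasks so that both kinds of cores finish at about the same time. It assumes a simple throughput-proportional split and is not the paper's performance model. The function names and the parameters `t_cpu_task`, `n_cpu_cores` and `t_gpu_task` (with any transfer overhead already folded into `t_gpu_task`) are hypothetical.

```python
def gpu_fraction(t_cpu_task, n_cpu_cores, t_gpu_task):
    """Fraction of tasks to send to the GPU under a throughput-proportional split.

    Assumption (not from the paper): each CPU core needs t_cpu_task seconds
    per task, and the GPU as a whole needs t_gpu_task seconds per task.
    """
    cpu_rate = n_cpu_cores / t_cpu_task   # tasks per second on all CPU cores
    gpu_rate = 1.0 / t_gpu_task           # tasks per second on the GPU
    return gpu_rate / (cpu_rate + gpu_rate)


def split_tasks(n_tasks, frac_gpu):
    """Return (tasks for the CPU cores, tasks for the GPU)."""
    n_gpu = round(n_tasks * frac_gpu)
    return n_tasks - n_gpu, n_gpu


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    frac = gpu_fraction(t_cpu_task=8.0, n_cpu_cores=4, t_gpu_task=0.5)
    print(frac, split_tasks(1000, frac))
```

With such a split, both sides are expected to finish at roughly the same time, which is the balance condition a model of this kind aims for. A real scheduler would also need to measure or estimate the per-task times.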
Keywords :
fault tolerant computing; graphics processing units; multiprocessing systems; parallel architectures; task analysis; CPU-GPU architectures; CPU-GPU core mixes; FastFlow; autonomic module effectiveness; autonomic task scheduling; data parallel pattern scheduling part; data parallel patterns; generic map; heterogeneous architecture; optimal execution time; performance figures; performance model properties; programmer intervention; Computational modeling; Data models; Graphics processing units; Multicore processing; Parallel processing; Skeleton; GPU; autonomic computing; data parallelism; parallel design patterns;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
High Performance Computing and Simulation (HPCS), 2013 International Conference on
Conference_Location :
Helsinki
Print_ISBN :
978-1-4799-0836-3
Type :
conf
DOI :
10.1109/HPCSim.2013.6641395
Filename :
6641395