DocumentCode :
2958704
Title :
MATE-CG: A Map Reduce-Like Framework for Accelerating Data-Intensive Computations on Heterogeneous Clusters
Author :
Jiang, Wei ; Agrawal, Gagan
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
fYear :
2012
fDate :
21-25 May 2012
Firstpage :
644
Lastpage :
655
Abstract :
Clusters of GPUs have rapidly emerged as the means for achieving extreme-scale, cost-effective, and power-efficient high performance computing. At the same time, high-level APIs like map-reduce are being used for developing several types of high-end and/or data-intensive applications. Map-reduce, originally developed for data processing applications, has been successfully used for many classes of applications that involve a significant amount of computation, such as machine learning, image processing, and data mining applications. Because such applications can be accelerated using GPUs (and other accelerators), there has been interest in supporting map-reduce-like APIs on GPUs. However, while the use of map-reduce for a single GPU has been studied, developing map-reduce-like models for programming a heterogeneous CPU-GPU cluster remains an open challenge. This paper presents the MATE-CG system, which is a map-reduce-like framework based on the generalized reduction API. We develop support for enabling scalable and efficient implementation of data-intensive applications in a heterogeneous cluster of multi-core CPUs and many-core GPUs. Our contributions are threefold: 1) we port the generalized reduction model to clusters of modern GPUs with a map-reduce-like API, dealing with very large datasets, 2) we further propose three schemes to better utilize the computing power of CPUs and/or GPUs and develop an auto-tuning strategy to achieve the best-possible heterogeneous configuration for iterative applications, and 3) we show how analytical models can be used to optimize important parameters in our system. We evaluate our system using three representative data-intensive applications and report results on a heterogeneous cluster of 128 CPU cores and 16 GPUs (7168 GPU cores). We show an average speedup of 87× on this cluster over execution with 2 CPU cores. Our applications also achieve an average improvement of 25% by using CPU cores and GPUs simultaneously, over the best performance achieved from using only one of the types of resources in the cluster.
Keywords :
application program interfaces; graphics processing units; multiprocessing systems; parallel processing; pattern clustering; GPU cluster; MATE-CG system; Map-Reduce-like API; MapReduce-like framework; analytical models; autotuning strategy; data mining applications; data processing applications; data-intensive computation acceleration; generalized reduction API; generalized reduction model; heterogeneous CPU-GPU cluster programming; heterogeneous clusters; high level API; high performance computing; image processing; machine learning; many-core GPU; multicore CPU; Acceleration; Analytical models; Computational modeling; Data processing; Graphics processing unit; Multicore processing; Runtime; Data-Intensive Computing; GPUs; Heterogeneous Systems;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International
Conference_Location :
Shanghai
ISSN :
1530-2075
Print_ISBN :
978-1-4673-0975-2
Type :
conf
DOI :
10.1109/IPDPS.2012.65
Filename :
6267866