DocumentCode :
2379343
Title :
Accelerating SVMs by integrating GPUs into MapReduce clusters
Author :
Herrero-Lopez, Sergio
Author_Institution :
Intell. Eng. Syst. Lab., Massachusetts Inst. of Technol., Cambridge, MA, USA
fYear :
2011
fDate :
9-12 Oct. 2011
Firstpage :
1298
Lastpage :
1305
Abstract :
The uninterrupted growth of information repositories has progressively led data-intensive applications, such as MapReduce-based systems, to the mainstream. The MapReduce paradigm has frequently proven to be a simple yet flexible and scalable technique to distribute algorithms across thousands of nodes and petabytes of information. Under these circumstances, classic data mining algorithms have been adapted to this model in order to run in production environments. Unfortunately, the high-latency nature of this architecture has restricted the applicability of these algorithms to batch-processing scenarios. In spite of this shortcoming, the emergence of massively threaded shared-memory multiprocessors, such as Graphics Processing Units (GPUs), on the commodity computing market has enabled these algorithms to be executed orders of magnitude faster while keeping the same MapReduce-based model. In this paper, we propose the integration of massively threaded shared-memory multiprocessors into MapReduce-based clusters, creating a unified heterogeneous architecture that enables executing Map and Reduce operators on thousands of threads across multiple GPU devices and nodes while maintaining the built-in reliability of the baseline system. For this purpose, we created a programming model that facilitates the collaboration of multiple CPU cores and multiple GPU devices towards the resolution of a data-intensive problem. In order to prove the potential of this hybrid system, we take a popular NP-hard supervised learning algorithm, the Support Vector Machine (SVM), and show that a 36x - 192x speedup can be achieved on large datasets without changing the model or leaving the commodity hardware paradigm.
Keywords :
computational complexity; coprocessors; data mining; learning (artificial intelligence); shared memory systems; support vector machines; GPU; MapReduce clusters; MapReduce paradigm; MapReduce-based systems; NP-hard supervised learning; SVM; data mining; data-intensive applications; graphics processing units; information repositories; massively threaded shared-memory multiprocessors; programming model; support vector machine; Computational modeling; Computer architecture; Graphics processing unit; Instruction sets; Message systems; Parallel processing; Support vector machines; Multiprocessing; Parallel Algorithms; Pattern Classification;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Systems, Man, and Cybernetics (SMC), 2011 IEEE International Conference on
Conference_Location :
Anchorage, AK
ISSN :
1062-922X
Print_ISBN :
978-1-4577-0652-3
Type :
conf
DOI :
10.1109/ICSMC.2011.6083839
Filename :
6083839