DocumentCode
1858050
Title
Performance of CUDA Virtualized Remote GPUs in High Performance Clusters
Author
Duato, José ; Pena, A.J. ; Silla, Federico ; Mayo, Rafael ; Quintana-Ortí, Enrique S.
Author_Institution
Univ. Politec. de Valencia (UPV), Valencia, Spain
fYear
2011
fDate
13-16 Sept. 2011
Firstpage
365
Lastpage
374
Abstract
In previous work we presented the architecture of rCUDA, a middleware that enables CUDA remoting over a commodity network. That is, the middleware allows an application to use a CUDA-compatible Graphics Processor (GPU) installed in a remote computer as if it were installed in the computer where the application is being executed. This approach is based on the observation that GPUs in a cluster are not usually fully utilized, and it is intended to reduce the number of GPUs in the cluster, thus lowering the costs related to acquisition and maintenance while keeping performance close to that of the fully-equipped configuration. In this paper we model rCUDA over a series of high-throughput networks in order to assess the influence of the performance of the underlying network on the performance of our virtualization technique. For this purpose, we analyze the traces of two different case studies over two different networks. Using this data, we calculate the expected performance for these same case studies over a series of high-throughput networks, in order to characterize the expected behavior of our solution in high-performance clusters. The estimations are validated using real 1 Gbps Ethernet and 40 Gbps InfiniBand networks, showing an error rate on the order of 1% for executions involving data transfers above 40 MB. In summary, although our virtualization technique noticeably increases execution time when using a 1 Gbps Ethernet network, it performs almost as efficiently as a local GPU when higher-performance interconnects are used. Therefore, the small overhead incurred by our proposal because of the remote use of GPUs is worth the savings that a cluster configuration with fewer GPUs than nodes yields.
Keywords
computer graphic equipment; virtual machines; workstation clusters; CUDA virtualized remote GPU; Ethernet; InfiniBand networks; bit rate 1 Gbit/s; bit rate 40 Gbit/s; graphics processor; high performance clusters; high throughput networks; rCUDA; Acceleration; Computer architecture; Graphics processing unit; Kernel; Payloads; Proposals; Servers; CUDA; Clusters; Graphics processors (GPUs); high performance computing; virtualization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel Processing (ICPP), 2011 International Conference on
Conference_Location
Taipei City
ISSN
0190-3918
Print_ISBN
978-1-4577-1336-1
Electronic_ISBN
0190-3918
Type
conf
DOI
10.1109/ICPP.2011.58
Filename
6047204