DocumentCode :
3079188
Title :
An Evaluation of Unified Memory Technology on NVIDIA GPUs
Author :
Wenqiang Li ; Guanghao Jin ; Xuewen Cui ; Simon See
Author_Institution :
Center for High Performance Comput., Shanghai Jiao Tong Univ., Shanghai, China
fYear :
2015
fDate :
4-7 May 2015
Firstpage :
1092
Lastpage :
1098
Abstract :
Unified Memory is an emerging technology supported by CUDA 6.X. Before CUDA 6.X, the CUDA programming model relied on programmers to explicitly manage data between CPU and GPU, which increased programming complexity. CUDA 6.X introduces Unified Memory, a programming model that defines CPU and GPU memory space as a single coherent memory (imagine a single common address space). The system manages data access between CPU and GPU without explicit memory copy functions. This paper evaluates the Unified Memory technology through different applications on different GPUs to show users how to use the Unified Memory technology of CUDA 6.X efficiently. The applications include the Diffusion3D Benchmark, the Parboil Benchmark Suite, and Matrix Multiplication from the CUDA SDK Samples. We converted those applications to corresponding Unified Memory versions and compared them with the original ones. We selected the NVIDIA Kepler K40 and the Jetson TK1, which represent the latest GPUs with the Kepler architecture and the first mobile platform in the NVIDIA series with a Kepler GPU. This paper shows that the Unified Memory versions incur a 10% performance loss on average. Furthermore, we used the NVIDIA Visual Profiler to investigate the cause of the performance loss introduced by the Unified Memory technology.
Keywords :
graphics processing units; parallel architectures; storage management; CPU; CUDA 6.X; CUDA SDK samples; CUDA programming model; Diffusion3D benchmark; Jetson TK1; Kepler architecture; NVIDIA GPUs; NVIDIA Kepler K40; NVIDIA visual profiler; data management; matrix multiplication; mobile platform; parboil benchmark suite; programming complexity; single coherent memory; unified memory technology; Benchmark testing; Computational modeling; Graphics processing units; Kernel; Memory management; Programming; Random access memory; CUDA programming model; Heterogeneous Computing; Unified Memory;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on
Conference_Location :
Shenzhen
Type :
conf
DOI :
10.1109/CCGrid.2015.105
Filename :
7152596