DocumentCode
3079188
Title
An Evaluation of Unified Memory Technology on NVIDIA GPUs
Author
Wenqiang Li ; Guanghao Jin ; Xuewen Cui ; Simon See
Author_Institution
Center for High Performance Comput., Shanghai Jiao Tong Univ., Shanghai, China
fYear
2015
fDate
4-7 May 2015
Firstpage
1092
Lastpage
1098
Abstract
Unified Memory is an emerging technology supported by CUDA 6.X. Before CUDA 6.X, the CUDA programming model relied on programmers to explicitly manage data transfers between the CPU and GPU, which increased programming complexity. CUDA 6.X introduces Unified Memory, a programming model that presents CPU and GPU memory as a single coherent memory space (conceptually, one common address space). The system manages data access between the CPU and GPU without explicit memory copy functions. This paper evaluates Unified Memory across different applications on different GPUs to show users how to use the Unified Memory technology of CUDA 6.X efficiently. The applications include the Diffusion3D Benchmark, the Parboil Benchmark Suite, and Matrix Multiplication from the CUDA SDK Samples. We converted these applications to corresponding Unified Memory versions and compared them with the originals. We selected the NVIDIA Kepler K40 and the Jetson TK1, which represent, respectively, the latest GPUs with the Kepler architecture and the first NVIDIA mobile platform with a Kepler GPU. The results show that the Unified Memory versions incur a 10% performance loss on average. Furthermore, we used the NVIDIA Visual Profiler to investigate the cause of the performance loss under Unified Memory.
Keywords
graphics processing units; parallel architectures; storage management; CPU; CUDA 6.X; CUDA SDK samples; CUDA programming model; Diffusion3D benchmark; Jetson TK1; Kepler architecture; NVIDIA GPUs; NVIDIA Kepler K40; NVIDIA visual profiler; data management; matrix multiplication; mobile platform; parboil benchmark suite; programming complexity; single coherent memory; unified memory technology; Benchmark testing; Computational modeling; Graphics processing units; Kernel; Memory management; Programming; Random access memory; CUDA programming model; Heterogeneous Computing; Unified Memory;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on
Conference_Location
Shenzhen
Type
conf
DOI
10.1109/CCGrid.2015.105
Filename
7152596