Regional Information Center for Science and Technology - An Evaluation of Unified Memory Technology on NVIDIA GPUs

DocumentCode :

3079188

Title :

An Evaluation of Unified Memory Technology on NVIDIA GPUs

Author :

Wenqiang Li ; Guanghao Jin ; Xuewen Cui ; Simon See

Author_Institution :

Center for High Performance Comput., Shanghai Jiao Tong Univ., Shanghai, China

fYear :

2015

fDate :

4-7 May 2015

Firstpage :

1092

Lastpage :

1098

Abstract :

Unified Memory is an emerging technology supported by CUDA 6.X. Before CUDA 6.X, the CUDA programming model relied on programmers to explicitly manage data movement between CPU and GPU, which increased programming complexity. CUDA 6.X introduces Unified Memory, a programming model that presents CPU and GPU memory as a single coherent memory space (that is, one common address space). The system manages data access between CPU and GPU without explicit memory copy functions. This paper evaluates Unified Memory through different applications on different GPUs to show users how to use it efficiently in CUDA 6.X. The applications include the Diffusion3D Benchmark, the Parboil Benchmark Suite, and Matrix Multiplication from the CUDA SDK Samples. We converted those applications to corresponding Unified Memory versions and compared them with the originals. We selected the NVIDIA Kepler K40 and the Jetson TK1, which represent the latest GPUs with the Kepler architecture and NVIDIA's first mobile platform with a Kepler GPU. This paper shows that the Unified Memory versions incur a 10% performance loss on average. Furthermore, we used the NVIDIA Visual Profiler to investigate the cause of the performance loss under Unified Memory.
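The contrast the abstract draws between explicit copies and a single address space can be illustrated with a minimal CUDA sketch. It is not taken from the paper or its benchmarks: the kernel `scale` and the helpers `scale_explicit` and `scale_managed` are hypothetical names chosen for illustration, and error handling is reduced to the allocation calls. It assumes a CUDA 6.0 or later toolkit and a GPU that supports managed memory.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: multiply every element of x by a, one thread per element.
__global__ void scale(float *x, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

// Explicit model: separate host and device buffers, moved with cudaMemcpy.
bool scale_explicit(int n) {
    std::vector<float> host(n, 1.0f);
    float *dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(float)) != cudaSuccess) return false;
    cudaMemcpy(dev, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    scale<<<(n + 255) / 256, 256>>>(dev, 2.0f, n);
    // A blocking device-to-host copy also waits for the kernel to finish.
    cudaMemcpy(host.data(), dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    for (float v : host) if (v != 2.0f) return false;
    return true;
}

// Unified Memory model: one managed pointer used by both CPU and GPU.
bool scale_managed(int n) {
    float *x = nullptr;
    if (cudaMallocManaged(&x, n * sizeof(float)) != cudaSuccess) return false;
    for (int i = 0; i < n; ++i) x[i] = 1.0f;      // written by the CPU
    scale<<<(n + 255) / 256, 256>>>(x, 2.0f, n);  // updated by the GPU
    // The host must synchronize before touching managed data again.
    cudaDeviceSynchronize();
    bool ok = true;
    for (int i = 0; i < n; ++i) if (x[i] != 2.0f) ok = false;
    cudaFree(x);
    return ok;
}

int main() {
    const int n = 1 << 20;
    std::printf("explicit copies: %s\n", scale_explicit(n) ? "ok" : "failed");
    std::printf("unified memory:  %s\n", scale_managed(n) ? "ok" : "failed");
    return 0;
}
```

The managed version drops both `cudaMemcpy` calls and the second buffer, which is the reduction in programming complexity the abstract describes. The data migration still happens, only under system control, which is one plausible place to look for the overhead the paper measures.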

Keywords :

graphics processing units; parallel architectures; storage management; CPU; CUDA 6.X; CUDA SDK samples; CUDA programming model; Diffusion3D benchmark; Jetson TK1; Kepler architecture; NVIDIA GPUs; NVIDIA Kepler K40; NVIDIA visual profiler; data management; matrix multiplication; mobile platform; Parboil benchmark suite; programming complexity; single coherent memory; unified memory technology; Benchmark testing; Computational modeling; Graphics processing units; Kernel; Memory management; Programming; Random access memory; CUDA programming model; Heterogeneous Computing; Unified Memory;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on

Conference_Location :

Shenzhen

Type :

conf

DOI :

10.1109/CCGrid.2015.105

Filename :

7152596

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3079188