Regional Information Center for Science and Technology - Using Fermi Architecture Knowledge to Speed up CUDA and OpenCL Programs

DocumentCode :

3090858

Title :

Using Fermi Architecture Knowledge to Speed up CUDA and OpenCL Programs

Author :

Torres, Yuri ; Gonzalez-Escribano, Arturo ; Llanos, Diego R.

Author_Institution :

Dept. Inf., Univ. Valladolid, Valladolid, Spain

fYear :

2012

fDate :

10-13 July 2012

Firstpage :

617

Lastpage :

624

Abstract :

NVIDIA graphics processing units (GPUs) play an important role as general-purpose programming devices. Implementing parallel codes that exploit the GPU hardware architecture is a task for experienced programmers. The choice of threadblock size and shape is one of the most important user decisions when a parallel problem is coded, and the threadblock configuration has a significant impact on the global performance of the program. While in the CUDA parallel programming model it is always necessary to specify the threadblock size and shape, the OpenCL standard also offers an automatic mechanism to make this delicate decision. In this paper we present a study of these criteria for the Fermi architecture, introducing a general approach for threadblock choice, and showing that there is considerable room for improvement in the OpenCL automatic strategy.
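As an illustration of why threadblock size matters on Fermi, the sketch below estimates how many warps can be resident on one multiprocessor for a given block size. It is not the approach presented in the paper: it only applies the published compute capability 2.x limits (32 threads per warp, at most 1024 threads per block, 48 resident warps and 8 resident blocks per multiprocessor) and deliberately ignores register and shared-memory usage. The names resident_warps and occupancy are hypothetical helpers introduced here.

```cpp
#include <algorithm>
#include <cassert>

// Per-multiprocessor limits for compute capability 2.x (Fermi).
constexpr int kWarpSize = 32;
constexpr int kMaxThreadsPerBlock = 1024;
constexpr int kMaxWarpsPerSM = 48;   // i.e. 1536 resident threads
constexpr int kMaxBlocksPerSM = 8;

// Hypothetical helper: warps resident on one multiprocessor for a given
// block size. Assumption: registers and shared memory are not limiting.
int resident_warps(int threads_per_block) {
    assert(threads_per_block > 0 && threads_per_block <= kMaxThreadsPerBlock);
    // A partially filled warp still occupies a whole warp slot.
    int warps_per_block = (threads_per_block + kWarpSize - 1) / kWarpSize;
    // Resident blocks are capped by both the block limit and the warp limit.
    int blocks = std::min(kMaxBlocksPerSM, kMaxWarpsPerSM / warps_per_block);
    return blocks * warps_per_block;
}

// Hypothetical helper: fraction of the warp slots that are in use.
double occupancy(int threads_per_block) {
    return static_cast<double>(resident_warps(threads_per_block)) / kMaxWarpsPerSM;
}
```

Under these assumptions a 192-thread block fills all 48 warp slots, a 64-thread block is capped at 16 warps by the eight-block limit, and a 1024-thread block leaves a third of the slots idle because only one such block fits.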

Keywords :

graphics processing units; parallel architectures; parallel programming; CUDA parallel programming model; GPU hardware architecture; NVIDIA graphics processing units; automatic mechanism; fermi architecture knowledge; parallel problem; speed up CUDA programs; speed up OpenCL programs; threadblock configuration; threadblock size; Benchmark testing; Computer architecture; Graphics processing unit; Instruction sets; Kernel; Shape; Tuning; CUDA; Fermi; GPGPU; OpenCL; automatic code tuning;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel and Distributed Processing with Applications (ISPA), 2012 IEEE 10th International Symposium on

Conference_Location :

Leganes

Print_ISBN :

978-1-4673-1631-6

Type :

conf

DOI :

10.1109/ISPA.2012.92

Filename :

6280352

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3090858