Regional Information Center for Science and Technology - Coordinating GPU Threads for OpenMP 4.0 in LLVM

DocumentCode :

3582181

Title :

Coordinating GPU Threads for OpenMP 4.0 in LLVM

Author :

Bertolli, Carlo ; Antao, Samuel F. ; Eichenberger, Alexandre E. ; O'Brien, Kevin ; Sura, Zehra ; Jacob, Arpith C. ; Chen, Tong ; Sallenave, Olivier

Author_Institution :

IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA

Year :

2014

Firstpage :

Lastpage :

Abstract :

GPU devices are becoming critical building blocks of high-performance platforms for performance and energy-efficiency reasons. As a consequence, parallel programming environments such as OpenMP were extended to support offloading code to such devices. OpenMP compilers are faced with offering an efficient implementation of device-targeting constructs. One main issue in implementing OpenMP on a GPU is efficiently supporting both sequential and parallel regions, as GPUs are optimized only to execute highly parallel workloads. Multiple solutions to this issue were proposed in previous research. In this paper, we propose a method to coordinate threads in an NVIDIA GPU that is both efficient and easily integrated as part of a compiler. To support our claims, we developed CUDA programs that mimic multiple coordination schemes and we compare their performance. We show that a scheme based on dynamic parallelism performs poorly compared to inspector-executor schemes that we introduce in this paper. We also discuss how to integrate these schemes into the LLVM compiler infrastructure.
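The mix of sequential and parallel regions that the abstract refers to can be illustrated with a small OpenMP 4.0 `target` region. This is only a sketch and is not taken from the paper: the function name `scale_then_sum` and its parameters are hypothetical, and it shows the kind of input a compiler has to handle, not the coordination scheme the authors propose.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example (not from the paper): a single offloaded region
 * that alternates between parallel and sequential code. */
double scale_then_sum(double *v, int n, double factor)
{
    assert(v != NULL && n > 0);
    double sum = 0.0;

    /* Offload the block to the default device. If no device is
     * available, or the compiler ignores OpenMP pragmas, the block
     * simply runs on the host. */
    #pragma omp target map(tofrom: v[0:n]) map(tofrom: sum)
    {
        /* Parallel region: iterations are shared among device threads. */
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            v[i] *= factor;

        /* Sequential region: OpenMP semantics require that only one
         * thread executes this loop, so on a GPU the remaining threads
         * must be kept idle in some coordinated way. */
        for (int i = 0; i < n; i++)
            sum += v[i];
    }
    return sum;
}
```

Calling `scale_then_sum` on the array {1, 2, 3, 4} with a factor of 2 doubles every element in place and returns 20. The point of the sketch is the transition between the two loops: a GPU implementation has to decide which threads are active in each part of the region.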

Keywords :

graphics processing units; multi-threading; parallel architectures; program compilers; CUDA programs; GPU devices; GPU threads; LLVM compiler infrastructure; NVIDIA GPU; OpenMP 4.0; OpenMP compilers; code offloading; dynamic parallelism; graphics processing unit; high-performance platforms; inspector-executor schemes; parallel programming environment; parallel regions; parallel workloads; sequential regions; Acceleration; Graphics processing units; Kernel; Parallel processing; Performance evaluation; Synchronization

Language :

English

Publisher :

IEEE

Conference_Title :

LLVM Compiler Infrastructure in HPC (LLVM-HPC), 2014

Type :

conf

DOI :

10.1109/LLVM-HPC.2014.10

Filename :

7069297

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3582181