Regional Information Center for Science and Technology (RICeST) - Java with Auto-parallelization on Graphics Coprocessing Architecture

DocumentCode :

656183

Title :

Java with Auto-parallelization on Graphics Coprocessing Architecture

Author :

Guodong Han ; Chenggang Zhang ; King Tin Lam ; Cho-Li Wang

Author_Institution :

Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong, China

fYear :

2013

fDate :

1-4 Oct. 2013

Firstpage :

504

Lastpage :

509

Abstract :

GPU-based many-core accelerators have gained a footing in supercomputing. Their widespread adoption still hinges on better parallelization and load scheduling techniques that make the hybrid system of CPU and GPU cores easy to use efficiently. This paper introduces a new user-friendly compiler framework and runtime system, dubbed Japonica, to help Java applications harness the full power of a heterogeneous system. Japonica presents an all-round system design that unifies the programming style and language for transparent use of both CPU and GPU resources, automatically parallelizing all kinds of loops and scheduling workloads efficiently across the CPU-GPU border. By means of simple user annotations, sequential Java source code is analyzed, translated and compiled into a dual executable consisting of CUDA kernels and multiple Java threads running on GPU and CPU cores respectively. Annotated loops are automatically split into loop chunks (or tasks) that are scheduled to execute on all available GPU/CPU cores. Implementing a GPU-tailored thread-level speculation (TLS) model, Japonica supports speculative execution of loops with moderate dependency densities and privatization of loops having only false dependencies on the GPU side. The scheduler also supports task stealing and task sharing algorithms that allow swift load redistribution across GPU and CPU. Experimental results show that Japonica, on average, can run 10x, 2.5x and 2.14x faster than the best serial (1-thread CPU), GPU-alone and CPU-alone versions respectively.
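
The loop-chunking idea in the abstract can be illustrated with a minimal, CPU-only Java sketch. It is not Japonica's implementation: there is no GPU back end, no annotation processing and no speculation here, and the class and method names (ChunkedLoopSketch, split, runSquares) are hypothetical. It only shows a dependence-free loop being cut into chunks that worker threads pull from a shared queue, so that faster workers naturally take on more chunks.

```java
import java.util.concurrent.ConcurrentLinkedDeque;

public class ChunkedLoopSketch {
    /** A half-open range [start, end) of loop iterations (hypothetical helper type). */
    static final class Chunk {
        final int start;
        final int end;
        Chunk(int start, int end) { this.start = start; this.end = end; }
    }

    /** Splits iterations 0..n-1 into chunks of at most chunkSize iterations. */
    static ConcurrentLinkedDeque<Chunk> split(int n, int chunkSize) {
        ConcurrentLinkedDeque<Chunk> queue = new ConcurrentLinkedDeque<>();
        for (int s = 0; s < n; s += chunkSize) {
            queue.add(new Chunk(s, Math.min(s + chunkSize, n)));
        }
        return queue;
    }

    /** Computes out[i] = i * i, with worker threads pulling chunks from a shared queue. */
    static int[] runSquares(int n, int chunkSize, int workers) {
        int[] out = new int[n];
        ConcurrentLinkedDeque<Chunk> queue = split(n, chunkSize);
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            threads[w] = new Thread(() -> {
                Chunk c;
                // Each worker keeps taking chunks until the queue is empty.
                while ((c = queue.pollFirst()) != null) {
                    for (int i = c.start; i < c.end; i++) {
                        out[i] = i * i; // iterations are independent, so no locking is needed
                    }
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // join also makes the workers' writes visible to the caller
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] result = runSquares(1000, 64, 4);
        System.out.println(result[999]);
    }
}
```

In this sketch every worker is an ordinary Java thread; in a heterogeneous setting such as the one the abstract describes, some chunks would instead be dispatched to GPU kernels, which is outside the scope of this illustration.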

Keywords :

Java; graphics processing units; multiprocessing systems; parallel architectures; program compilers; source code (software); CPU cores; CUDA kernels; GPU cores; GPU-based many-core accelerators; GPU-tailored thread-level speculation; Japonica; Java applications; graphics coprocessing architecture; load scheduling technique; loops execution; multiple Java threads; runtime system; sequential Java source code; supercomputing; task sharing algorithms; user-friendly compiler framework; Central Processing Unit; Graphics processing units; Instruction sets; Java; Kernel; Parallel processing; Programming; GPGPU; Multi-Cores; Parallelization; Profiling; Scheduling; Speculation;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel Processing (ICPP), 2013 42nd International Conference on

Conference_Location :

Lyon

ISSN :

0190-3918

Type :

conf

DOI :

10.1109/ICPP.2013.62

Filename :

6687386

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=656183