DocumentCode :
144596
Title :
Dynamic memory optimization and parallelism management for OpenCL
Author :
Chao-Hung Hsu ; I-Wei Wu ; Jean Jyh-Jiun Shann
Author_Institution :
Dept. of Comput. Sci., Nat. Chiao Tung Univ., Hsinchu, Taiwan
Volume :
2
fYear :
2014
fDate :
26-28 April 2014
Firstpage :
776
Lastpage :
780
Abstract :
Multiprocessor platforms have recently become a common route to high performance. OpenCL (Open Computing Language) is one of the programming standards for heterogeneous multiprocessors and provides portability across these platforms. Our research focuses on platforms with CPUs and GPUs, since GPUs are now in widespread use. On such a platform, two programming issues may significantly affect GPU computing performance: workload distribution and use of the GPU memory hierarchy. To fully exploit the characteristics of GPUs, programmers must be not only proficient in parallel programming but also familiar with hardware specifications. In this paper, we therefore propose a compilation pass that automatically optimizes OpenCL kernels. The pass transforms a naïve input kernel function by applying kernel function analysis, work-group rearrangement, memory coalescing, and work-item merging. In addition, our framework is implemented in a runtime system so that it can dynamically adjust the optimization parameters according to the hardware specifications. In terms of execution time, the optimized kernels generated by our design may achieve significant performance improvements over the naïve versions. Although optimizations performed at runtime may incur time overhead, in most cases that overhead may be offset by intensive kernel computation or massive input data.
Keywords :
graphics processing units; multiprocessing systems; operating system kernels; optimising compilers; parallel programming; software performance evaluation; storage management; CPU; GPU memory hierarchy; Open Computing Language; OpenCL; OpenCL kernel optimization; compilation pass; dynamic memory optimization; dynamic optimizing parameter adjustment; hardware specifications; heterogeneous multiprocessors; kernel computation; kernel function analysis; memory coalescing; multiprocessor platforms; naive kernel function; parallel programming; parallelism management; performance improvement; portability; runtime system; work load distribution; work-group rearrangement; work-item merging; Graphics processing units; Kernel; Memory management; Optimization; Parallel processing; Random access memory; Registers; GPU; LLVM; OpenCL; dynamic optimization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information Science, Electronics and Electrical Engineering (ISEEE), 2014 International Conference on
Conference_Location :
Sapporo
Print_ISBN :
978-1-4799-3196-5
Type :
conf
DOI :
10.1109/InfoSEEE.2014.6947772
Filename :
6947772