DocumentCode :
3588743
Title :
Fine-grained parallel implementation of edge-directed Image Interpolation on GPU
Author :
Wenze Li ; Jiaji Wu ; Jiao Shi
Author_Institution :
Key Lab. of Intell., Perception & Image Understanding of Minist. of Educ. of China, Xidian Univ., Xi'an, China
fYear :
2014
Firstpage :
937
Lastpage :
940
Abstract :
Edge-directed interpolation is widely used to enhance the visual quality of remote sensing images. Compared with traditional bicubic and bilinear interpolation, it achieves better visual quality at the cost of a large number of matrix operations. CUDA (Compute Unified Device Architecture) offers tremendous performance in many high-performance computing areas, and edge-directed interpolation can be mapped to this architecture readily. However, CUDA-based parallel schemes are generally decomposed into coarse-grained tasks, a granularity suited to thread blocks. In this paper, a fine-grained parallel approach to edge-directed interpolation is proposed. Based on CUDA, the interpolation of each missing pixel is assigned to 4x4 threads, because the majority of the matrix operations involve 4x4 matrices. This task division strategy minimizes the resource pressure on thread blocks. Our computation scheme is expressed in a form that increases parallelism and is efficiently implemented on the GPU. Using one NVIDIA GTX480 GPU and one NVIDIA GTX590 GPU with asynchronous I/O transfer, our GPU optimization of the fine-grained edge-directed interpolation scheme achieves a speedup of 69.8x over its CPU counterpart, C code running on one core of an Intel Core(TM) i7-920.
Keywords :
edge detection; graphics processing units; image resolution; parallel architectures; CUDA; NVIDIA GTX480 GPU; NVIDIA GTX590 GPU; asynchronous I/O transfer; compute unified device architecture; edge-directed image interpolation; fine-grained parallel implementation; matrix operations; task division strategy; Algorithm design and analysis; Graphics processing units; Image edge detection; Instruction sets; Interpolation; Matrix decomposition; Visualization; CUDA (Compute Unified Device Architecture); GPU; edge-directed interpolation; fine-grained
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Systems (ICPADS), 2014 20th IEEE International Conference on
Type :
conf
DOI :
10.1109/PADSW.2014.7097912
Filename :
7097912