Title :
Optimizing the Census Transform on CUDA enabled GPUs
Author :
Pantilie, Cosmin D. ; Nedevschi, Sergiu
Author_Institution :
Comput. Sci. Dept., Tech. Univ. of Cluj-Napoca, Cluj-Napoca, Romania
Conference_Date :
Aug. 30, 2012 - Sept. 1, 2012
Abstract :
The Census Transform is one of the most widely used matching metrics in problems that involve correspondence search, such as stereo reconstruction and optical flow. Graphics processing units (GPUs) have become popular platforms for such computation-intensive applications, which expose a high degree of data parallelism. Their evolution as a platform for general-purpose computing, through the continuous addition of new hardware features, has improved performance for many applications, but it has also expanded the set of possible implementation choices to the point where guidelines alone are not sufficient for optimum performance. What is the best implementation in the case of the Census Transform? This paper answers that question by benchmarking all major possible implementations. Its aim is to provide an optimal implementation of the Census Transform on a current-generation graphics processing unit using the Compute Unified Device Architecture (CUDA). The results have value reaching far beyond the Census Transform and provide insight for applications where non-separable 2D convolutions are present.
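For readers unfamiliar with the transform named in the abstract, the following is a minimal CPU reference sketch in plain Python, not the paper's CUDA implementation. It assumes a square window of a given radius, a "neighbor darker than center" comparison, row-major bit ordering, and zero signatures at the image border; the function names `census_transform` and `hamming_distance` are hypothetical and chosen only for illustration.

```python
def census_transform(image, radius=1):
    """Return a census signature per pixel for a 2D list of intensities.

    Assumption: pixels whose window would leave the image keep signature 0.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    out = [[0] * width for _ in range(height)]
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            center = image[y][x]
            signature = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue  # the center is not compared with itself
                    # Append one bit: 1 if the neighbor is darker than the center.
                    bit = 1 if image[y + dy][x + dx] < center else 0
                    signature = (signature << 1) | bit
            out[y][x] = signature
    return out


def hamming_distance(a, b):
    """Count differing bits between two census signatures."""
    return bin(a ^ b).count("1")
```

Every output pixel reads a full neighborhood and the window cannot be split into independent row and column passes, which is the same access pattern as a non-separable 2D convolution. Signatures from two images are typically compared with the Hamming distance, as in `hamming_distance` above.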
Keywords :
convolution; graphics processing units; parallel architectures; transforms; CUDA; GPU; census transform optimization; computation intensive applications; compute unified device architecture; data parallelism; general purpose computing; graphic processing units; matching metrics; nonseparable 2D convolutions; optical flow; stereo reconstruction; Instruction sets; Kernel; Memory management; Performance evaluation; Census Transform; image processing; parallel processing
Conference_Title :
Intelligent Computer Communication and Processing (ICCP), 2012 IEEE International Conference on
Conference_Location :
Cluj-Napoca
Print_ISBN :
978-1-4673-2953-8
DOI :
10.1109/ICCP.2012.6356186