DocumentCode :
3255319
Title :
Workload analysis and efficient OpenCL-based implementation of SIFT algorithm on a smartphone
Author :
Wang, Guohui ; Rister, Blaine ; Cavallaro, J.R.
Author_Institution :
Dept. of Electr. & Comput. Eng., Rice Univ., Houston, TX, USA
fYear :
2013
fDate :
3-5 Dec. 2013
Firstpage :
759
Lastpage :
762
Abstract :
Feature detection and extraction are essential in computer vision applications such as image matching and object recognition. The Scale-Invariant Feature Transform (SIFT) algorithm is one of the most robust approaches for detecting and extracting distinctive invariant features from images. However, its high computational complexity makes it difficult to apply the SIFT algorithm to mobile applications. Recent developments in mobile processors have enabled heterogeneous computing on mobile devices, such as smartphones and tablets. In this paper, we present an OpenCL-based implementation of the SIFT algorithm on a smartphone, taking advantage of the mobile GPU. We carefully analyze the SIFT workloads and identify the parallelism. We implement the major steps of the SIFT algorithm using both serial C++ code and OpenCL kernels targeting mobile processors, to compare the performance of different workflows. Based on the profiling results, we partition the SIFT algorithm between the CPU and GPU in a way that best exploits the parallelism and minimizes the buffer transfer time to achieve better performance. The experimental results show that we are able to achieve 8.5 FPS for keypoint detection and 19 FPS for descriptor generation without reducing the number or quality of the keypoints. Moreover, the heterogeneous implementation can reduce energy consumption by 41% compared to an optimized CPU-only implementation.
Keywords :
C++ language; computational complexity; computer vision; feature extraction; graphics processing units; mobile computing; open systems; power aware computing; smart phones; software performance evaluation; transforms; OpenCL kernels; OpenCL-based implementation; SIFT algorithm; SIFT workload analysis; buffer transferring time; computational complexity; computer vision applications; descriptor generation; distinctive invariant feature detection; distinctive invariant feature extraction; energy consumption; heterogeneous computing; keypoint detection; mobile GPU; mobile devices; mobile processors; optimized CPU-only implementation; scale-invariant feature transform algorithm; serial C++ code; smartphones; tablets; Algorithm design and analysis; Feature extraction; Graphics processing units; Kernel; Mobile communication; Partitioning algorithms; CPU-GPU algorithm partitioning; GPU; OpenCL; SIFT; mobile SoC;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Global Conference on Signal and Information Processing (GlobalSIP), 2013 IEEE
Conference_Location :
Austin, TX
Type :
conf
DOI :
10.1109/GlobalSIP.2013.6737002
Filename :
6737002