Title :
An efficient parallel motion estimation algorithm and X264 parallelization in CUDA
Author :
Ko, Youngsub ; Yi, Youngmin ; Ha, Soonhoi
Author_Institution :
Sch. of EECS, Seoul Nat. Univ., Seoul, South Korea
Abstract :
H.264/AVC video encoders have been widely used for their high coding efficiency. Since the computational demand, which is proportional to the frame resolution, is constantly increasing, there has been great interest in accelerating H.264/AVC by parallel processing. Recently, graphics processing units (GPUs) have emerged as a viable target for accelerating general-purpose applications by exploiting fine-grained data parallelism. Despite extensive research effort to use GPUs to accelerate the H.264/AVC algorithm, these efforts have not achieved any speed-up over x264, which is known as the fastest CPU implementation, because of the significant communication overhead between the host CPU and the GPU and the intra-frame dependency in the algorithm. In this paper, we propose a novel motion estimation (ME) algorithm tailored for NVIDIA GPU implementation. It is accompanied by a novel pipelining technique, called sub-frame ME processing, that effectively hides the communication overhead between the host CPU and the GPU. The proposed H.264 encoder achieves more than 20% speed-up compared with x264.
Keywords :
graphics processing units; motion estimation; video codecs; video coding; CUDA; H.264/AVC video encoders; NVIDIA GPU; X264 parallelization; efficient parallel motion estimation algorithm; high coding efficiency; pipelining technique; subframe ME processing; Algorithm design and analysis; Encoding; Graphics processing unit; Instruction sets; Kernel; Parallel processing; Vectors; GPU; H.264; Motion Estimation
Conference_Title :
Design and Architectures for Signal and Image Processing (DASIP), 2011 Conference on
Conference_Location :
Tampere
Print_ISBN :
978-1-4577-0620-2
Electronic_ISBN :
978-1-4577-0619-6
DOI :
10.1109/DASIP.2011.6136860