Motion Estimation Without Integer-Pel Search

Author

Ling Li ; Shaoli Liu ; Yunji Chen ; Tianshi Chen ; Tao Luo

Author_Institution

Inst. of Comput. Technol., Beijing, China

Volume

22

Issue

4

fYear

2013

fDate

April 2013

Firstpage

1340

Lastpage

1353

Abstract

Typical motion estimation (ME) consists of three main steps: spatial-temporal prediction, integer-pel search, and fractional-pel search. The integer-pel search, which seeks the best-matched integer-pel position within a search window, is considered crucial for video encoding. It occupies over 50% of the overall encoding time in software encoders (when the full search scheme is adopted) and imposes considerable area cost, memory traffic, and power consumption on hardware encoders. In this paper, we find that video sequences (especially high-resolution videos) can often be encoded effectively and efficiently even without integer-pel search. This counter-intuitive phenomenon arises not only because spatial-temporal prediction and fractional-pel search are accurate enough for the ME of many blocks. We also observe that when the predicted motion vector deviates from the optimal motion vector (mainly for boundary blocks of irregularly moving objects), integer-pel search itself can hardly reduce the final rate-distortion cost: the deviation of the reference position can be alleviated by fractional-pel interpolation and rate-distortion optimization techniques (e.g., adaptive macroblock mode). Because the proportion of boundary blocks decreases as video resolution increases, integer-pel search may be rather cost-ineffective in the era of high resolution. Experimental results on 36 typical sequences of different resolutions encoded with x264, a widely used video encoder, agree well with our analysis. For 1080p sequences, removing the integer-pel search saves 57.9% of the overall H.264 encoding time on average (compared with the original x264 with full integer-pel search using default parameters), while the resulting performance loss is negligible: the bit rate increases by only 0.18% and the peak signal-to-noise ratio decreases by only 0.01 dB per frame on average.
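
The two-step flow described in the abstract (prediction followed directly by fractional-pel refinement, with no integer-pel search in between) can be illustrated with a minimal Python sketch. This is not the authors' implementation or x264 code. The function names, the bilinear interpolation (H.264 itself uses a 6-tap filter for half-pel samples), the simplified rate term, and the refinement radius are all assumptions made for illustration. Motion vectors are expressed in quarter-pel units.

```python
def median_mv(left, top, top_right):
    """Component-wise median of three neighbouring motion vectors
    (a simplified stand-in for spatial motion vector prediction)."""
    xs = sorted(v[0] for v in (left, top, top_right))
    ys = sorted(v[1] for v in (left, top, top_right))
    return (xs[1], ys[1])


def sample(frame, x4, y4):
    """Pixel at quarter-pel position (x4/4, y4/4), bilinearly interpolated.
    Assumption: bilinear filtering replaces the codec's real filters."""
    h, w = len(frame), len(frame[0])
    x4 = min(max(x4, 0), 4 * (w - 1))  # clamp to the frame
    y4 = min(max(y4, 0), 4 * (h - 1))
    x0, fx = divmod(x4, 4)
    y0, fy = divmod(y4, 4)
    x1 = min(x0 + 1, w - 1)
    y1 = min(y0 + 1, h - 1)
    top = frame[y0][x0] * (4 - fx) + frame[y0][x1] * fx
    bot = frame[y1][x0] * (4 - fx) + frame[y1][x1] * fx
    return (top * (4 - fy) + bot * fy) / 16.0


def rd_cost(cur, ref, bx, by, size, mv, pred, lam):
    """SAD distortion plus lam times a crude rate proxy
    (distance of mv from the predictor, in quarter-pel units)."""
    sad = 0.0
    for j in range(size):
        for i in range(size):
            ref_px = sample(ref, 4 * (bx + i) + mv[0], 4 * (by + j) + mv[1])
            sad += abs(cur[by + j][bx + i] - ref_px)
    rate = abs(mv[0] - pred[0]) + abs(mv[1] - pred[1])
    return sad + lam * rate


def estimate_without_integer_search(cur, ref, bx, by, size, pred,
                                    lam=1.0, radius=3):
    """Start at the predicted vector and test only quarter-pel offsets
    within +/- radius of it; no integer-pel search window is scanned."""
    best_mv = pred
    best_cost = rd_cost(cur, ref, bx, by, size, pred, pred, lam)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            mv = (pred[0] + dx, pred[1] + dy)
            c = rd_cost(cur, ref, bx, by, size, mv, pred, lam)
            if c < best_cost:
                best_mv, best_cost = mv, c
    return best_mv, best_cost


if __name__ == "__main__":
    # Synthetic 16x16 reference frame; the current frame is the reference
    # shifted left by one pixel, so the true vector is (4, 0) in quarter-pel.
    ref = [[x * x + 3 * y * y for x in range(16)] for y in range(16)]
    cur = [[ref[y][min(x + 1, 15)] for x in range(16)] for y in range(16)]
    pred = median_mv((3, 0), (4, 1), (5, 0))
    print(estimate_without_integer_search(cur, ref, 4, 4, 4, pred))
```

In this toy setting the fractional-pel refinement alone recovers the true displacement whenever the predictor lands within the refinement radius, which mirrors the abstract's argument that accurate prediction makes the integer-pel step redundant for many blocks.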

Keywords

image sequences; interpolation; motion estimation; optimisation; video coding; adaptive macroblock mode; boundary blocks; fractional-pel interpolation; fractional-pel search; high-resolution videos; rate-distortion optimization; spatial-temporal prediction; video encoding; video sequences; Accuracy; Rate-distortion; Spatial resolution; Vectors; Integer-pel search; motion compensation; motion vector prediction

fLanguage

English

Journal_Title

Image Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1057-7149

Type

jour

DOI

10.1109/TIP.2012.2228495

Filename

6357281