Author :
Li, Dong-Xiao ; Zheng, Wei ; Zhang, Ming
Abstract :
Motion estimation (ME) is the most critical component of a video coding system, and it accounts for most of the computational complexity and memory bandwidth. This paper presents a novel full-search block-matching hardware architecture for H.264/AVC integer motion estimation (IME) that is efficient in both memory access and computation. With the highest level of on-chip data reuse, each off-chip reference pixel is accessed only once, which minimizes the off-chip memory bandwidth. Distributed data caching and virtual connection of reference picture boundaries keep the data traffic scheduling simple, regular, and efficient. The computation engine employs a two-dimensional (2-D) systolic processor array to calculate the absolute differences in a single-instruction multiple-data (SIMD) manner, and 2-D adder trees to sum up the absolute differences, all with 100% utilization. The proposed architecture fully supports the variable block-size matching of H.264/AVC, and can produce 41 sums of absolute differences (SADs) for one search point every cycle, without pipeline bubbles. The architecture is described as a parameterized design, and an implementation for standard-definition digital TV encoding applications is presented. Theoretical analysis and experimental results show that the proposed architecture can achieve the minimum off-chip memory bandwidth and the maximum computational performance.
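Illustrative note: the figure of 41 SADs per search point follows from the H.264/AVC partitioning of a 16x16 macroblock (1 of 16x16, 2 of 16x8, 2 of 8x16, 4 of 8x8, 8 of 8x4, 8 of 4x8, and 16 of 4x4). The Python sketch below is a software model only, not the hardware described in the abstract. It assumes the macroblock and the reference candidate are given as 16x16 lists of integers, and it builds the larger-block SADs by summing the sixteen 4x4 SADs, in the spirit of an adder tree. All function and variable names are hypothetical.

```python
# Software sketch (assumption: not the paper's architecture) of variable
# block-size SAD computation for one search point of a 16x16 macroblock.

# H.264/AVC partition sizes as (width, height) in pixels.
PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]


def sad_4x4_grid(cur, ref):
    """Return a 4x4 grid holding the SAD of each 4x4 sub-block."""
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid


def region_sum(grid, row, col, height, width):
    """Sum the 4x4 SADs covering a region given in 4x4-block units."""
    return sum(
        grid[r][c]
        for r in range(row, row + height)
        for c in range(col, col + width)
    )


def all_sads(cur, ref):
    """Return a dict mapping 'WxH' to the list of SADs for that partition."""
    grid = sad_4x4_grid(cur, ref)
    sads = {}
    for w, h in PARTITIONS:
        bw, bh = w // 4, h // 4  # partition size in 4x4-block units
        sads[f"{w}x{h}"] = [
            region_sum(grid, r, c, bh, bw)
            for r in range(0, 4, bh)
            for c in range(0, 4, bw)
        ]
    return sads
```

Calling `all_sads(cur, ref)` for a single candidate position returns seven lists whose lengths add up to 41. A full search would repeat this for every candidate in the search window.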
Keywords :
VLSI; cache storage; computational complexity; image matching; motion estimation; parallel processing; tree searching; video coding; 2D adder trees; 2D systolic processor array; H.264/AVC integer motion estimation; computation complexity; data traffic scheduling; digital TV encoding; distributed data caching; full-search block-matching hardware architecture; minimum memory bandwidth; off-chip memory bandwidth; single-instruction multiple-data; video coding system; Automatic voltage control; Bandwidth; Computer architecture; Digital TV; Engines; Hardware; Motion estimation; Processor scheduling; Two dimensional displays; Video coding;