Title :
A low complexity H.264/AVC 4×4 intra prediction architecture with macroblock/block reordering
Author :
Orlandic, M. ; Svarstad, K.
Author_Institution :
Dept. of Electron. & Telecommun., Norwegian Univ. of Sci. & Technol., Trondheim, Norway
Abstract :
The H.264/AVC standard possesses high-complexity features, among which is intra prediction, characterized by high data dependency and an immense amount of computation. Real-time compression is therefore a challenge. This paper presents a novel low-complexity architecture of intra 4×4 prediction for baseline, main and high profile H.264 frame encoders that meet different throughput requirements. Intra 4×4 prediction consists of a prediction loop and a reconstruction loop, and these two phases are performed in a pipelined manner by processing sixteen data items in parallel. A new macroblock-level scanning order is proposed in order to increase efficiency and exploit the data dependency between blocks. A number of scenarios, such as on-the-fly processing of macroblock rows and simultaneous processing of several macroblock rows, have been exploited depending on the input buffer capabilities. The second case requires an input buffer that accommodates a number of macroblocks corresponding to the frame width. Ultra-high throughput requires parallel processing of a large amount of data. For the proposed waveform macroblock scanning order, the intra prediction module can be used as a processing element for building a complex reconfigurable design. By varying the number of intra prediction modules, the design can be adapted to achieve different throughput. The focus of this paper is on the architecture of the intra prediction module. In the case of on-the-fly processing, it takes 48 cycles to process one macroblock (MB). The proposed architecture is synthesized and implemented on a Kintex 705-XC7K325T board and requires 48 MHz to encode 4k×2k video at 30 fps in real time, which is a significant reduction of the frequency requirement compared to the state of the art.
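The 48 MHz figure can be sanity-checked from the 48 cycles per macroblock quoted in the abstract. The sketch below is only a back-of-the-envelope estimate, not the authors' derivation. It assumes a single intra prediction module, no pipeline stalls or buffering overhead, and 16×16 macroblocks. The abstract does not give the exact pixel dimensions behind "4k×2k", so both 4096×2048 and 3840×2160 are tried as assumptions. The function name `required_clock_hz` is hypothetical.

```python
import math

CYCLES_PER_MB = 48   # from the abstract (on-the-fly processing)
MB_SIZE = 16         # an H.264 macroblock covers 16x16 luma samples
FPS = 30             # frame rate quoted in the abstract


def required_clock_hz(width, height, fps=FPS, cycles_per_mb=CYCLES_PER_MB):
    """Minimum clock (Hz) for one module, assuming no stalls or overhead."""
    mbs_per_row = math.ceil(width / MB_SIZE)   # macroblocks across the frame
    mb_rows = math.ceil(height / MB_SIZE)      # macroblock rows in the frame
    return mbs_per_row * mb_rows * fps * cycles_per_mb


# Both resolutions are assumptions about what "4k x 2k" means.
for w, h in [(4096, 2048), (3840, 2160)]:
    hz = required_clock_hz(w, h)
    print(f"{w}x{h} @ {FPS} fps: {hz / 1e6:.2f} MHz")
```

Under these assumptions both candidate resolutions need roughly 47 MHz, which is consistent with the 48 MHz stated in the abstract.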
Keywords :
buffer storage; circuit complexity; parallel architectures; pipeline processing; reconfigurable architectures; video coding; H.264 frame encoders; Kintex 705-XC7K325T board; complex reconfigurable design; data dependency; fly processing; frequency requirement reduction; input buffer capabilities; intraprediction module; low complexity H.264-AVC 4×4 intra prediction architecture; macroblock-block reordering; parallel data processing; pipeline processing; prediction loop; reconstruction loop; ultra high throughput; waveform macroblock level scanning order; Clocks; Computer architecture; Discrete cosine transforms; Image reconstruction; Quantization (signal); Real-time systems; Throughput;
Conference_Title :
Reconfigurable Computing and FPGAs (ReConFig), 2013 International Conference on
Conference_Location :
Cancun
Print_ISBN :
978-1-4799-2078-5
DOI :
10.1109/ReConFig.2013.6732306