Abstract:
Over the last decade, image and video processing (IVP) has gradually moved from special-purpose computer architectures based on massive parallelism (MP) to general-purpose computer architectures based on instruction-level parallelism (ILP). The IVP community now faces a new challenge: applying IVP in small embedded systems (e.g., video players, smart cameras, and digital diaries) that are also based on ILP processors. Because of their requirements for small size, low weight, and low power consumption, these embedded systems do not use processors that feature advanced dynamic code optimization mechanisms such as those based on instruction reordering and register renaming. At the same time, the compile-time techniques of present-generation compilers do not appear aggressive enough to exploit the massive parallelism of IVP tasks on ILP architectures, which leads to inefficient programs. This paper analyzes the efficiency of IVP programs on ILP CPUs. In particular, it presents: (1) a reference model for the efficient design and implementation of highly parallel programs, such as those of the IVP domain; (2) an analysis of the inefficiencies of IVP programs implemented on ILP processors; and (3) a set of techniques, derived from the reference model, that overcome these inefficiencies. These techniques are based on a novel computing paradigm called bucket processing.
Keywords:
image processing; parallel architectures; parallelising compilers; code optimization; compilers; embedded systems; image and video processing; instruction-level parallelism; massive parallelism; source level optimization; Application software; Computer aided instruction; Computer architecture; Concurrent computing; Embedded computing; Embedded system; Energy consumption; Parallel processing; Personal communication networks; Smart cameras