VLIW DSP vs. superscalar implementation of a baseline H.263 video encoder

Author

Banerjee, Serene ; Sheikh, Hamid R. ; John, Lizy K. ; Evans, Brian L. ; Bovik, Alan C.

Author_Institution

Dept. of Electr. & Comput. Eng., Texas Univ., Austin, TX, USA

Volume

2

fYear

2000

fDate

Oct. 29 2000-Nov. 1 2000

Firstpage

1665

Abstract

A Very Long Instruction Word (VLIW) processor and a superscalar processor can execute multiple instructions simultaneously. A VLIW processor depends on the compiler and programmer to find the parallelism in the instructions, whereas a superscalar processor determines the parallelism at runtime. This paper compares the TI TMS320C6700 VLIW digital signal processor (DSP) and SimpleScalar superscalar implementations of a baseline H.263 video encoder in C. With level two C compiler optimization, a one-way issue superscalar processor is 7.5 times faster than the VLIW DSP for the same processor clock speed. The superscalar speedup from one-way to four-way issue is 2.88:1, and from four-way to 256-way issue is 2.43:1. To reduce the execution time on the C6700, we write assembly routines for sum-of-absolute-difference, interpolation, and reconstruction, and place frequently used code and data into on-chip memory. We use TI's discrete cosine transform assembly routines. The hand optimized VLIW DSP implementation is 61 times faster than the C version compiled with level two optimization. Most of the improvement was due to the efficient placement of data and programs in memory. The hand optimized VLIW implementation is 14% faster than a 256-way superscalar implementation without hand optimizations.
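For readers unfamiliar with the sum-of-absolute-difference kernel named in the abstract, the following is a minimal plain C sketch of what such a routine computes. It is not the authors' code or their assembly routine: the function name `sad_16x16`, the `stride` parameter, and the choice of a 16x16 block (the H.263 macroblock size) are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from the paper): sum of absolute differences
 * between two 16x16 blocks of 8-bit pixels.
 * cur and ref point at the top-left pixel of each block; stride is the
 * number of bytes between the starts of consecutive rows, assumed here
 * to be the same for both blocks. */
unsigned sad_16x16(const uint8_t *cur, const uint8_t *ref, int stride)
{
    unsigned sum = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++) {
            /* Widen to int before subtracting so the difference can be negative. */
            sum += (unsigned)abs((int)cur[x] - (int)ref[x]);
        }
        /* Advance both pointers to the next row of their blocks. */
        cur += stride;
        ref += stride;
    }
    return sum;
}
```

A motion estimator would typically call a routine like this once per candidate position in the search window and keep the candidate with the smallest result, which is why the inner loop is a natural target for hand-written assembly.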

Keywords

digital signal processing chips; interpolation; multiprocessing systems; program compilers; video coding; C compiler optimization; SimpleScalar superscalar; TI TMS320C6700 VLIW digital signal processor; VLIW DSP; assembly routines; baseline H.263 video encoder; compiler; discrete cosine transform assembly routines; interpolation; sum-of-absolute-difference; superscalar implementation; superscalar processor; Assembly; Clocks; Digital signal processing; Digital signal processors; Interpolation; Optimizing compilers; Program processors; Programming profession; Runtime; VLIW;

fLanguage

English

Publisher

IEEE

Conference_Titel

Signals, Systems and Computers, 2000. Conference Record of the Thirty-Fourth Asilomar Conference on

Conference_Location

Pacific Grove, CA, USA

ISSN

1058-6393

Print_ISBN

0-7803-6514-3

Type

conf

DOI

10.1109/ACSSC.2000.911272

Filename

911272