DocumentCode :
2964076
Title :
VLIW DSP vs. superscalar implementation of a baseline H.263 video encoder
Author :
Banerjee, Serene ; Sheikh, Hamid R. ; John, Lizy K. ; Evans, Brian L. ; Bovik, Alan C.
Author_Institution :
Dept. of Electr. & Comput. Eng., Texas Univ., Austin, TX, USA
Volume :
2
fYear :
2000
fDate :
Oct. 29 2000-Nov. 1 2000
Firstpage :
1665
Abstract :
A Very Long Instruction Word (VLIW) processor and a superscalar processor can execute multiple instructions simultaneously. A VLIW processor depends on the compiler and programmer to find the parallelism in the instructions, whereas a superscalar processor determines the parallelism at runtime. This paper compares TI TMS320C6700 VLIW digital signal processor (DSP) and SimpleScalar superscalar implementations of a baseline H.263 video encoder in C. With level two C compiler optimization, a one-way issue superscalar processor is 7.5 times faster than the VLIW DSP for the same processor clock speed. The superscalar speedup from one-way to four-way issue is 2.88:1, and from four-way to 256-way issue is 2.43:1. To reduce the execution time on the C6700, we write assembly routines for sum-of-absolute-difference, interpolation, and reconstruction, and place frequently used code and data into on-chip memory. We use TI's discrete cosine transform assembly routines. The hand-optimized VLIW DSP implementation is 61× faster than the C version compiled with level two optimization. Most of the improvement was due to the efficient placement of data and programs in memory. The hand-optimized VLIW implementation is 14% faster than a 256-way superscalar implementation without hand optimizations.
Keywords :
digital signal processing chips; interpolation; multiprocessing systems; program compilers; video coding; C compiler optimization; SimpleScalar superscalar; TI TMS320C6700 VLIW digital signal processor; VLIW DSP; assembly routines; baseline H.263 video encoder; compiler; discrete cosine transform assembly routines; interpolation; sum-of-absolute-difference; superscalar implementation; superscalar processor; Assembly; Clocks; Digital signal processing; Digital signal processors; Interpolation; Optimizing compilers; Program processors; Programming profession; Runtime; VLIW;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signals, Systems and Computers, 2000. Conference Record of the Thirty-Fourth Asilomar Conference on
Conference_Location :
Pacific Grove, CA, USA
ISSN :
1058-6393
Print_ISBN :
0-7803-6514-3
Type :
conf
DOI :
10.1109/ACSSC.2000.911272
Filename :
911272