DocumentCode :
839439
Title :
NEDA: a low-power high-performance DCT architecture
Author :
Shams, Ahmed M. ; Chidanandan, Archana ; Pan, Wendi ; Bayoumi, Magdy A.
Author_Institution :
Center for Adv. Comput. Studies, Univ. of Louisiana, Lafayette, LA, USA
Volume :
54
Issue :
3
fYear :
2006
fDate :
3/1/2006 12:00:00 AM
Firstpage :
955
Lastpage :
964
Abstract :
Conventional distributed arithmetic (DA) is popular in application-specific integrated circuit (ASIC) design, and it features on-chip ROM to achieve high speed and regularity. In this paper, a new DA architecture called NEDA is proposed, aimed at reducing the cost metrics of power and area while maintaining high speed and accuracy in digital signal processing (DSP) applications. Mathematical analysis proves that DA can implement the inner product of vectors in the form of two's complement numbers using only additions, followed by a small number of shifts at the final stage. Comparative studies show that NEDA outperforms widely used approaches such as multiply/accumulate (MAC) and DA in many aspects. Being a high-speed architecture free of ROM, multiplication, and subtraction, NEDA can also expose the redundancy existing in the adder array consisting of entries of 0 and 1. A hardware compression scheme is introduced to generate a butterfly structure with a minimum number of additions. NEDA-based architectures for an 8 × 8 discrete cosine transform (DCT) core are presented as an example. Savings exceeding 88% are achieved when the compression scheme is applied along with NEDA. Finite word-length simulations demonstrate the viability and excellent performance of NEDA.
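The add-then-shift idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's architecture: it assumes integer coefficients in an 8-bit two's complement format (real DCT coefficients are fractional), and it handles the sign bit with a plain negation rather than whatever subtraction-free scheme NEDA uses. The function names `to_twos_complement_bits` and `add_shift_inner_product` are hypothetical, chosen for illustration only.

```python
def to_twos_complement_bits(value, width):
    """Return the two's complement bits of a signed integer, LSB first."""
    if not -(1 << (width - 1)) <= value < (1 << (width - 1)):
        raise ValueError("value does not fit in the given width")
    unsigned = value & ((1 << width) - 1)
    return [(unsigned >> i) & 1 for i in range(width)]


def add_shift_inner_product(coeffs, inputs, width=8):
    """Inner product computed from bit-plane sums of the inputs.

    Assumption: coefficients are fixed integers that fit in `width` bits.
    """
    # Each coefficient becomes a row of 0/1 entries (the "adder array").
    bit_rows = [to_twos_complement_bits(c, width) for c in coeffs]
    result = 0
    for i in range(width):
        # Adder stage: add up the inputs whose coefficient has bit i set.
        partial = 0
        for bits, x in zip(bit_rows, inputs):
            if bits[i]:
                partial += x
        # Shift stage: weight the partial sum by 2**i.
        weighted = partial << i
        # The top bit is the sign bit and carries weight -2**(width - 1).
        # Simplification: this sketch negates here; it is not NEDA's method.
        result += -weighted if i == width - 1 else weighted
    return result


coeffs = [3, -5, 7, -2]
inputs = [10, -4, 6, 1]
direct = sum(c * x for c, x in zip(coeffs, inputs))
print(add_shift_inner_product(coeffs, inputs) == direct)
```

No multiplier appears anywhere in `add_shift_inner_product`: every product is replaced by conditional additions followed by one shift per bit position, which is the property the abstract attributes to DA-style inner products.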
Keywords :
adders; digital signal processing chips; discrete cosine transforms; distributed arithmetic; DCT; DSP applications; adder array; application-specific integrated circuit design; digital signal processing; discrete cosine transform; distributed arithmetic; finite word-length simulations; hardware compression scheme; mathematical analysis; multiply-accumulate; on-chip ROM; Adders; Application specific integrated circuits; Arithmetic; Costs; Digital signal processing; Digital signal processing chips; Discrete cosine transforms; Hardware; Read only memory; Signal processing algorithms; Discrete cosine transform (DCT); ROM-free multiplication; distributed arithmetic; inner product;
fLanguage :
English
Journal_Title :
Signal Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1053-587X
Type :
jour
DOI :
10.1109/TSP.2005.862755
Filename :
1597561