DocumentCode :
3401953
Title :
Design of a fast inner product processor
Author :
Smith, S.P. ; Torng, H.C.
Author_Institution :
School of Electrical Engineering, Cornell University, Ithaca, New York 14853
fYear :
1985
fDate :
4-6 June 1985
Firstpage :
38
Lastpage :
43
Abstract :
This paper presents the design of a fast inner product processor with appreciably reduced latency and cost. The inner product processor is implemented with a tree of carry-propagate or carry-save adders; this tree is obtained by incorporating three innovations into the conventional multiply/add tree: (1) The leaf multipliers are expanded into adder subtrees, thus achieving an O(log Nb) latency, where N denotes the number of elements in a vector and b the number of bits in each element. (2) The partial products to be summed in producing an inner product are reordered according to their "minimum alignments", bringing approximately a 20% saving in hardware. (3) The reordering also shortens the carry propagation chain in the final propagation stage by 2 log b − 1 positions, significantly reducing the latency further. A form of the Baugh and Wooley algorithm is adopted to implement two's complement notation with changes only in peripheral hardware.
Keywords :
Adders; Binary trees; Delay; Hardware; Pipeline processing; Throughput; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Arithmetic (ARITH), 1985 IEEE 7th Symposium on
Conference_Location :
Urbana, IL
Type :
conf
DOI :
10.1109/ARITH.1985.6158974
Filename :
6158974