• DocumentCode
    1399294
  • Title

    A High-Speed, Energy-Efficient Two-Cycle Multiply-Accumulate (MAC) Architecture and Its Application to a Double-Throughput MAC Unit

  • Author

    Hoang, Tung Thanh ; Själander, Magnus ; Larsson-Edefors, Per

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Chalmers Univ. of Technol., Gothenburg, Sweden
  • Volume
    57
  • Issue
    12
  • fYear
    2010
  • Firstpage
    3073
  • Lastpage
    3081
  • Abstract
    We propose a high-speed and energy-efficient two-cycle multiply-accumulate (MAC) architecture that supports two's complement numbers and includes accumulation guard bits and saturation circuitry. The first MAC pipeline stage contains only partial-product generation circuitry and a reduction tree, while the second stage, thanks to a special sign-extension solution, implements all other functionality. Place-and-route evaluations using a 65-nm 1.1-V cell library show that the proposed architecture offers a 31% improvement in speed and a 32% reduction in energy per operation, averaged across operand sizes of 16, 32, 48, and 64 bits, over a reference two-cycle MAC architecture that employs a multiplier in the first stage and an accumulator in the second. When the proposed architecture is operated at the lower frequency of the reference architecture, the available timing slack can be used to downsize gates, resulting in a 52% reduction in energy compared to the reference. We extend the new architecture to create a versatile double-throughput MAC (DTMAC) unit that efficiently performs either multiply-accumulate or multiply operations for N-bit, 1 × N/2-bit, or 2 × N/2-bit operands. In comparison to a fixed-function 32-bit MAC unit, 16-bit multiply-accumulate operations can be executed with 67% higher energy efficiency on a 32-bit DTMAC unit.
  • Keywords
    arithmetic codes; multiplying circuits; pipeline arithmetic; accumulation guard bits; double-throughput MAC unit; multiply operations; multiply-accumulate architecture; partial-product generation circuitry; place-and-route evaluations; saturation circuitry; size 65 nm; voltage 1.1 V; word length 16 bit; word length 32 bit; word length 48 bit; word length 64 bit; Adders; Computer architecture; Delay; Energy efficiency; Logic gates; Pipelines; Arithmetic circuits; energy efficient; high speed; multiply-accumulate unit; variable wordlength;
  • fLanguage
    English
  • Journal_Title
    Circuits and Systems I: Regular Papers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1549-8328
  • Type

    jour

  • DOI
    10.1109/TCSI.2010.2091191
  • Filename
    5661880
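
The abstract mentions two's complement operands, accumulation guard bits, and saturation. As a rough illustration of what those three terms mean arithmetically, here is a minimal behavioral sketch in Python. It is not the paper's hardware architecture or pipeline; the function names `saturate` and `mac`, the parameters `n` and `guard`, and the default of 8 guard bits are assumptions made only for this example.

```python
def saturate(value, bits):
    """Clamp an integer to the signed two's complement range of `bits` bits."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def mac(acc, a, b, n=16, guard=8):
    """One multiply-accumulate step on n-bit signed operands.

    The accumulator is modeled as 2*n + guard bits wide (an assumption for
    this sketch): 2*n bits hold a full product, and the guard bits give
    headroom so that repeated accumulation does not overflow immediately.
    If the sum still exceeds that range, it is clamped instead of wrapping.
    """
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    if not (lo <= a <= hi and lo <= b <= hi):
        raise ValueError("operand does not fit in n-bit two's complement")
    return saturate(acc + a * b, 2 * n + guard)


def dot(xs, ys, n=16, guard=8):
    """Accumulate a dot product by chaining mac() steps from a zero accumulator."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac(acc, x, y, n=n, guard=guard)
    return acc
```

For example, with 4-bit operands and no guard bits the accumulator is 8 bits wide, so `mac(100, 7, 7, n=4, guard=0)` clamps at the largest 8-bit value, while adding two guard bits leaves room for the exact sum.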