• DocumentCode
    927300
  • Title
    Algorithm and architecture for a high density, low power scalar product macrocell
  • Author
    Gu, J. ; Chang, C.-H. ; Yeo, K.-S.
  • Author_Institution
    Sch. of Electr. & Electron. Eng., Nanyang Technol. Univ., Singapore, Singapore
  • Volume
    151
  • Issue
    2
  • fYear
    2004
  • fDate
    3/19/2004 12:00:00 AM
  • Firstpage
    161
  • Lastpage
    172
  • Abstract
    The authors present a design approach for an arithmetic macrocell that computes the scalar product of two vectors, an operation ubiquitous in the solution of communications and digital signal processing problems. The core of the proposed architecture is a fully combinational design containing a partial product generator, a partial product accumulator and a vector accumulator. The design addresses the competing optimisation goals of VLSI area, power dissipation and latency in the deep submicron regime. Compared with conventional merged arithmetic architectures, the proposed macrocell design represents a substantial improvement in the VLSI layout, with little area wastage, a high degree of regularity and good scalability for different vector lengths and operand widths. A theoretical analysis shows that a 16-bit scalar product multiplier for input vectors with 16 elements, compared with a traditionally designed architecture, achieves a saving of 38.6% in silicon area, an increase of up to 73% in area usage efficiency and a saving of 29.4% in interconnect delay. Post-layout simulations of the proposed circuit, based on a 0.18 μm CMOS process, show an average power dissipation of 64.96 mW and a latency of 6.92 ns at a standard supply voltage of 1.8 V, superior performance for a single-cycle instruction in a high-speed, low-voltage 16-bit digital signal processor operating at 144 MHz. Shorter interconnects and more equalised interconnect delays substantially reduce the power dissipation and delay incurred by the interconnects. Post-layout simulation of the proposed circuit at supply voltages ranging from 0.7 to 3.3 V shows a significant power reduction of 6 to 13% over the pre-layout simulation results of the conventional design.
  • Keywords
    VLSI; circuit optimisation; circuit simulation; delay estimation; digital arithmetic; digital signal processing chips; integrated circuit design; integrated logic circuits; logic simulation; multiplying circuits; 16-bit scalar product multiplier; CMOS process; VLSI area; arithmetic macrocell design; combinational design; deep submicron regime; digital signal processing problems; interconnect delays; merged arithmetic architectures; optimisation goals; partial product accumulator; partial product generator; postlayout simulations; power dissipation; prelayout simulation; single cycle instruction; supply voltages; vector accumulator;
  • fLanguage
    English
  • Journal_Title
    Computers and Digital Techniques, IEE Proceedings -
  • Publisher
    iet
  • ISSN
    1350-2387
  • Type
    jour
  • DOI
    10.1049/ip-cdt:20040328
  • Filename
    1274033
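
The three stages named in the abstract (partial product generator, partial product accumulator, vector accumulator) can be illustrated with a minimal behavioural sketch. It models only the arithmetic, not the circuit, layout or timing described in the paper. The function names, the unsigned-operand assumption and the column-count representation are all hypothetical choices made for illustration; the record does not describe the authors' actual algorithm.

```python
def generate_partial_products(x, y, width=16):
    """Partial product generator (illustrative): AND every bit of x with every
    bit of y. Returns one list per column weight i + j holding the product bits.
    Assumes unsigned operands that fit in `width` bits."""
    if not (0 <= x < (1 << width) and 0 <= y < (1 << width)):
        raise ValueError("operands must be unsigned and fit in `width` bits")
    columns = [[] for _ in range(2 * width - 1)]
    for i in range(width):
        for j in range(width):
            columns[i + j].append(((x >> i) & 1) & ((y >> j) & 1))
    return columns


def accumulate_partial_products(columns):
    """Partial product accumulator (illustrative): reduce each column of bits
    to a count of ones, keeping the column weights separate."""
    return [sum(bits) for bits in columns]


def scalar_product(xs, ys, width=16):
    """Vector accumulator (illustrative): merge the column counts of all
    element pairs, then weight each column by 2**position in one final sum.
    This mirrors the idea of merged arithmetic, where the individual products
    are never formed separately."""
    if len(xs) != len(ys):
        raise ValueError("vectors must have the same length")
    merged = [0] * (2 * width - 1)
    for x, y in zip(xs, ys):
        counts = accumulate_partial_products(
            generate_partial_products(x, y, width)
        )
        for weight, count in enumerate(counts):
            merged[weight] += count
    return sum(count << weight for weight, count in enumerate(merged))


if __name__ == "__main__":
    a = [3, 5, 65535]
    b = [7, 2, 65535]
    print(scalar_product(a, b) == sum(p * q for p, q in zip(a, b)))
```

Keeping the per-column counts separate until the final weighted sum is what distinguishes this from multiplying each pair and adding the results: the carries across columns are resolved once for the whole vector rather than once per product.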