• DocumentCode
    252369
  • Title
    Modified fused multiply-accumulate chained unit
  • Author
    Nasiri, Nasibeh ; Segal, Oren ; Margala, Martin

  • Author_Institution
    ECE Dept., Univ. of Massachusetts Lowell, Lowell, MA, USA
  • fYear
    2014
  • fDate
    3-6 Aug. 2014
  • Firstpage
    889
  • Lastpage
    892
  • Abstract
    Fused multiply-add (FMA) units can reduce latency and increase energy efficiency in arithmetic operations. This paper describes a modified architecture for a multiply-accumulate chained unit (MFMA). The add/sub pipelined datapath of a traditional fused multiply-add unit is modified to save hardware resources, conserve energy, and reduce latency in DSP applications. The proposed add/sub datapath is flexible and generic, and can be used in any IEEE-754 compatible floating-point architecture as a replacement for the traditional multiply-accumulate chained unit. FMA and MFMA are both implemented in a nine-stage pipelined design. The clock-limiting stage for both architectures is the normalization stage, which remains unchanged in the proposed architecture. The proposed three-input add/sub datapath is implemented on an FPGA, and the MFMA is implemented as an ASIC. In the FPGA implementation of the proposed add/sub datapath, area is reduced by 19.56%, power is reduced by 20.67%, and latency is halved compared to two cascaded two-input add/sub datapaths. In the ASIC implementations, the MFMA reduces overall area by 7.16% and power by 5.69% relative to the classic FMA.
  • Keywords
    application specific integrated circuits; field programmable gate arrays; floating point arithmetic; pipeline arithmetic; ASIC; DSP; FPGA; IEEE-754 compatible floating point architecture; MFMA; add-sub pipelined datapath; area reduction; arithmetic operations; cascaded two-input add-sub datapaths; clock limiting stage; energy conservation; energy efficiency; hardware resources; latency reduction; modified fused multiply-accumulate chained unit; nine-stage pipelined design; normalization stage; Application specific integrated circuits; Computer architecture; Delays; Digital signal processing; Hardware; Pipelines; Vectors; Floating point add datapath; Fused multiply-add (FMA); IEEE-754 format; Multiply-Accumulate chained unit (MAC); Single precision floating point
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Circuits and Systems (MWSCAS), 2014 IEEE 57th International Midwest Symposium on
  • Conference_Location
    College Station, TX
  • ISSN
    1548-3746
  • Print_ISBN
    978-1-4799-4134-6
  • Type
    conf
  • DOI
    10.1109/MWSCAS.2014.6908558
  • Filename
    6908558