Title :
Modified fused multiply-accumulate chained unit
Author :
Nasiri, Nasibeh ; Segal, Oren ; Margala, Martin
Author_Institution :
ECE Dept., Univ. of Massachusetts Lowell, Lowell, MA, USA
Abstract :
Fused multiply-add (FMA) units can reduce latency and increase energy efficiency in arithmetic operations. This paper describes a modified architecture for a multiply-accumulate chained unit (MFMA). The pipelined add/sub datapath of a traditional fused multiply-add unit is modified to save hardware resources, conserve energy, and reduce latency in DSP applications. The proposed add/sub datapath is flexible and generic, and can be used in any IEEE-754 compatible floating-point architecture as a replacement for the traditional multiply-accumulate chained unit. Both FMA and MFMA are implemented as nine-stage pipelined designs. The clock-limiting stage for both architectures is the normalization stage, which remains unchanged in the proposed architecture. The proposed three-input add/sub datapath is implemented on an FPGA, and the MFMA is implemented as an ASIC. In the FPGA implementation, the proposed add/sub datapath reduces area by 19.56% and power by 20.67%, and halves latency, compared to two cascaded two-input add/sub datapaths. In the ASIC implementations, the MFMA reduces overall area by 7.16% and power by 5.69% relative to the classic FMA.
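As general background on the fused versus chained distinction the abstract relies on, the sketch below emulates single-precision multiply-accumulate in Python. It is not the paper's datapath: the helper names `to_f32`, `chained_mac`, and `fused_mac` are hypothetical, and the fused path is only an emulation that uses a binary64 intermediate, which is exact for the inputs chosen here but not for all inputs.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def chained_mac(a: float, b: float, c: float) -> float:
    """Unfused multiply then add: rounds to binary32 after each operation."""
    return to_f32(to_f32(a * b) + c)


def fused_mac(a: float, b: float, c: float) -> float:
    """Emulated fused multiply-add: one rounding to binary32 at the end.

    Assumption: the product of two binary32 values is exact in binary64,
    but the following sum may round in binary64, so this is an emulation
    and is exact only when that sum is representable (as it is below).
    """
    return to_f32(a * b + c)


# a*a = 1 + 2**-11 + 2**-24 lies exactly halfway between two binary32
# values, so the intermediate rounding in the chained path discards 2**-24.
a = to_f32(1.0 + 2.0**-12)
c = to_f32(-(1.0 + 2.0**-11))

chained = chained_mac(a, a, c)  # 0.0: the low-order term was rounded away
fused = fused_mac(a, a, c)      # 2**-24: the low-order term survives
print(chained, fused)
```

The single final rounding is a property of IEEE 754 fused multiply-add in general. Whether the proposed three-input add/sub datapath rounds in the same way is not stated in the abstract.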
Keywords :
application specific integrated circuits; field programmable gate arrays; floating point arithmetic; pipeline arithmetic; ASIC; DSP; FPGA; IEEE-754 compatible floating point architecture; MFMA; add-sub pipelined datapath; area reduction; arithmetic operations; cascaded two-input add-sub datapaths; clock limiting stage; energy conservation; energy efficiency; hardware resources; latency reduction; modified fused multiply-accumulate chained unit; nine-stage pipelined design; normalization stage; Application specific integrated circuits; Computer architecture; Delays; Digital signal processing; Hardware; Pipelines; Vectors; Floating point add datapath; Fused multiply-add (FMA); IEEE-754 format; Multiply-Accumulate chained unit (MAC); Single precision floating point;
Conference_Title :
Circuits and Systems (MWSCAS), 2014 IEEE 57th International Midwest Symposium on
Conference_Location :
College Station, TX
Print_ISBN :
978-1-4799-4134-6
DOI :
10.1109/MWSCAS.2014.6908558