Title :
Fused Multiply-Add Microarchitecture Comprising Separate Early-Normalizing Multiply and Add Pipelines
Author_Institution :
ARM, Austin, TX, USA
Abstract :
We present an IEEE 754-2008 and ARM-compliant floating-point microarchitecture that preserves the higher performance of separate multiply and add units while decreasing the effective latency of fused multiply-adds (FMAs). The multiplier supports subnormals in a novel and faster manner, shifting the partial products so that injection rounding can be used. The early-normalizing adder retains the low latency of a split-path near/far adder, but does so in a unified path with less area. The adder also allows rounding on effective subtractions involving one input that is twice the normal width, a necessary feature for handling FMAs. The resulting floating-point unit has about twice the performance, measured in instructions per cycle (IPC), of the best previous ARM design, and can be clocked at a higher speed despite the wider paths required by FMAs.
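As a rough illustration of the injection rounding mentioned in the abstract, the sketch below models the generic idea on plain integers: a mode-dependent constant is added below the result's least significant bit, so that simple truncation then yields the rounded value. This is a textbook-style model under stated assumptions, not the paper's circuit; the function name `round_by_injection`, the mode labels, and the unsigned-integer representation of the significand are all hypothetical choices made for illustration.

```python
def round_by_injection(sig: int, drop_bits: int, mode: str = "RNE") -> int:
    """Round an unsigned significand by adding an injection constant, then truncating.

    sig       -- exact (unrounded) significand as a non-negative integer
    drop_bits -- number of low-order bits to discard (must be >= 1)
    mode      -- "RNE" (nearest, ties to even), "RU" (away from zero in
                 magnitude), or "RZ" (truncate); labels are illustrative only
    """
    if sig < 0 or drop_bits < 1:
        raise ValueError("expected a non-negative significand and drop_bits >= 1")

    mask = (1 << drop_bits) - 1      # all discarded bit positions set
    half = 1 << (drop_bits - 1)      # half a unit in the last kept place

    # The injection is chosen so that truncation after the add rounds correctly.
    injection = {"RNE": half, "RU": mask, "RZ": 0}[mode]

    total = sig + injection
    result = total >> drop_bits      # plain truncation of the discarded bits

    # Adding 'half' rounds ties upward. The discarded bits of 'total' are all
    # zero only in the exact-tie case, so clearing the LSB there gives ties-to-even.
    if mode == "RNE" and (total & mask) == 0:
        result &= ~1

    return result


# Example: keep the top bits of 4-bit values, discarding 2 bits.
print(round_by_injection(0b1011, 2, "RNE"))  # 2.75 rounds to nearest
print(round_by_injection(0b1010, 2, "RNE"))  # 2.5 is a tie, goes to the even neighbour
print(round_by_injection(0b1001, 2, "RU"))   # 2.25 rounds up in magnitude
```

The point of the technique is that the rounding position has to be known before the final addition, so the constant can be summed together with the other operands instead of requiring a separate rounding increment afterwards.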
Keywords :
IEEE standards; adders; computer architecture; floating point arithmetic; ARM compliant floating point microarchitecture; IEEE 754-2008; early normalizing adder; early normalizing multiply; multiply add microarchitecture; partial product; Adders; Arrays; Benchmark testing; Geometry; Lighting; Microarchitecture; Pipelines; floating-point; fused multiply add; microarchitecture;
Conference_Title :
Computer Arithmetic (ARITH), 2011 20th IEEE Symposium on
Conference_Location :
Tübingen
Print_ISBN :
978-1-4244-9457-6
DOI :
10.1109/ARITH.2011.25