Title :
High-performance floating point divide
Author :
Liddicoat, Albert A. ; Flynn, Michael J.
Author_Institution :
Comput. Syst. Lab., Stanford Univ., CA, USA
Abstract :
In modern processors floating point divide operations often take 20 to 25 clock cycles, five times that of multiplication. Typically multiplicative algorithms with quadratic convergence are used for high-performance divide. A divide unit based on the multiplicative Newton-Raphson iteration is proposed. This divide unit utilizes the higher-order Newton-Raphson reciprocal approximation to compute the quotient fast, efficiently and with high throughput. The divide unit achieves fast execution by computing the square, cube and higher powers of the approximation directly and much faster than the traditional approach with serial multiplications. Additionally, the second, third and higher-order terms are computed simultaneously further reducing the divide latency. Significant hardware reductions have been identified that reduce the overall computation significantly and therefore, reduce the area required for implementation and the power consumed by the computation. The proposed hardware unit is designed to achieve the desired quotient precision in a single iteration allowing the unit to be fully pipelined for maximum throughput
Keywords :
Newton-Raphson method; floating point arithmetic; pipeline arithmetic; hardware reductions; high-performance floating point divide; higher-order Newton-Raphson reciprocal approximation; multiplicative Newton-Raphson iteration; quadratic convergence; Acceleration; Clocks; Concurrent computing; Convergence; Delay; Hardware; Laboratories; Parallel processing; Taylor series; Throughput;
Conference_Titel :
Digital Systems Design, 2001. Proceedings. Euromicro Symposium on
Conference_Location :
Warsaw
Print_ISBN :
0-7695-1239-9
DOI :
10.1109/DSD.2001.952327