DocumentCode :
880264
Title :
A fully pipelined single-precision floating-point unit in the synergistic processor element of a CELL processor
Author :
Oh, Hwa-Joon ; Mueller, Silvia M. ; Jacobi, Christian ; Tran, Kevin D. ; Cottier, Scott R. ; Michael, Brad W. ; Nishikawa, Hiroo ; Totsuka, Yonetaro ; Namatame, Tatsuya ; Yano, Naoka ; Machida, Takashi ; Dhong, Sang H.
Author_Institution :
IBM Syst. & Technol. Group, Austin, TX, USA
Volume :
41
Issue :
4
fYear :
2006
fDate :
4/1/2006
Firstpage :
759
Lastpage :
771
Abstract :
The floating-point unit (FPU) in the synergistic processor element (SPE) of a CELL processor is a fully pipelined 4-way single-instruction multiple-data (SIMD) unit designed to accelerate media and data streaming with 128-bit operands. It supports 32-bit single-precision floating-point and 16-bit integer operands with two different latencies, six-cycle and seven-cycle, with 11 FO4 delay per stage. The FPU optimizes the performance of critical single-precision multiply-add operations. Since exact rounding, exceptions, and denormalized-number handling are not important to multimedia applications, IEEE correctness on the single-precision floating-point numbers is sacrificed for performance and simple design. It employs fine-grained clock gating for power saving. The design has 768K transistors in 1.3 mm², fabricated in 90-nm SOI technology. Correct operation has been observed up to 5.6 GHz at 1.4 V and 56°C, delivering 44.8 GFlops. Architecture, logic, circuits, and integration are codesigned to meet the performance, power, and area goals.
Keywords :
floating point arithmetic; integrated circuit design; logic design; microprocessor chips; pipeline processing; silicon-on-insulator; 1.4 V; 128 bit; 16 bit; 32 bit; 56 C; 90 nm; CELL processor; SIMD unit; fine-grained clock gating; floating-point unit; single-instruction multiple-data unit; single-precision floating-point numbers; single-precision multiply-add operations; synergistic processor element; Acceleration; DH-HEMTs; Delay; Fixed-point arithmetic; Jacobian matrices; Latches; Microprocessors; Registers; Streaming media; very large-scale integration
fLanguage :
English
Journal_Title :
Solid-State Circuits, IEEE Journal of
Publisher :
IEEE
ISSN :
0018-9200
Type :
jour
DOI :
10.1109/JSSC.2006.870924
Filename :
1610620