Title :
A GALS many-core heterogeneous DSP platform with source-synchronous on-chip interconnection network
Author :
Tran, Anh T. ; Truong, Dean N. ; Baas, Bevan M.
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of California, Davis, CA
Abstract :
This paper presents a many-core heterogeneous computational platform that employs a GALS-compatible circuit-switched on-chip network. The platform targets streaming DSP and embedded applications that have a high degree of task-level parallelism among computational kernels. The test chip, fabricated in 65 nm CMOS, consists of 164 small, simple programmable cores, three dedicated-purpose accelerators, and three shared memory modules. All processors are clocked by their own local oscillators, and communication is achieved through a simple yet effective source-synchronous technique that allows each interconnection link between any two processors to sustain a peak throughput of one data word per cycle. A complete 802.11a WLAN baseband receiver was implemented on this platform. It has a real-time throughput of 54 Mbps with all processors running at 594 MHz and 0.95 V, and consumes an average of 174.76 mW, of which 12.18 mW (7.0%) is dissipated by its interconnection links. By exploiting the GALS architecture to run each processor's oscillator at a workload-based optimal clock frequency, with the chip's dual supply voltages set at 0.95 V and 0.75 V, the receiver consumes only 123.18 mW, a 29.5% reduction in power. Power consumption measured on the real chip is within 2-5% of the estimated results, showing our design to be highly reliable and efficient.
Keywords :
CMOS integrated circuits; microprocessor chips; multiprocessor interconnection networks; telecommunication links; wireless LAN; source-synchronous on-chip interconnection network; 802.11a WLAN baseband receiver; CMOS; DSP platform; GALS many-core heterogeneous; bit rate 54 Mbit/s; dedicated-purpose accelerators; frequency 594 MHz; interconnection link; interconnection links; local oscillators; power 12.18 mW; power 123.18 mW; power 174.76 mW; processors; shared memory modules; size 65 nm; small programmable cores; source-synchronous communication technique; task-level parallelism; voltage 0.75 V; voltage 0.95 V; workload-based optimal clock frequency; Clocks; Computer networks; Concurrent computing; Digital signal processing; Digital signal processing chips; Integrated circuit interconnections; Multiprocessor interconnection networks; Network-on-a-chip; Parallel processing; Throughput
Conference_Title :
Networks-on-Chip, 2009. NoCS 2009. 3rd ACM/IEEE International Symposium on
Conference_Location :
San Diego, CA
Print_ISBN :
978-1-4244-4142-6
Electronic_ISBN :
978-1-4244-4143-3
DOI :
10.1109/NOCS.2009.5071470