DocumentCode
16180
Title
Efficient All-to-All Broadcast in Gaussian On-Chip Networks
Author
Zhemin Zhang ; Zhiyang Guo ; Yuanyuan Yang
Author_Institution
Dept. of Electr. & Comput. Eng., Stony Brook Univ., Stony Brook, NY, USA
Volume
62
Issue
10
fYear
2013
fDate
Oct. 2013
Firstpage
1959
Lastpage
1971
Abstract
With the development of multiprocessor systems-on-chip (MPSoCs), hundreds of computing cores are expected to operate on a single chip in the near future. This will require high-performance on-chip networks with very low latency to provide a communication substrate for the increasing number of cores. In this paper, we consider Gaussian on-chip networks, which have significant topological advantages over traditional mesh and torus networks in terms of diameter and average hop distance. Many applications on MPSoCs need global data movement and global control to exchange data and synchronize execution among cores, both of which require all-to-all broadcast communication. We propose an all-to-all broadcast algorithm suitable for on-chip implementation on the Gaussian network topology. The algorithm uses controlled message flooding based on a broadcast pattern that can be described in a formal, generic way for each node in terms of a few simple operations and can be easily built into router hardware. Furthermore, the generic broadcast pattern ensures a balanced traffic load in all dimensions of the network, so that the minimum total latency for all-to-all broadcast can be achieved. The algorithm overlaps message switching time with transmission time in a pipelined fashion to further reduce the total communication latency of all-to-all broadcast. Comparison results demonstrate the topological merits of Gaussian networks and the ultralow latency of the proposed all-to-all broadcast algorithm.
Keywords
Gaussian processes; microprocessor chips; multiprocessing systems; system-on-chip; Gaussian network topology; Gaussian on chip networks; MPSoC; all to all broadcast algorithm; all to all broadcast communication; average hop distance; balanced traffic load; computing cores; controlled message flooding; generic broadcast pattern; high performance on chip networks; message switching time; multiprocessor system on chips; router hardware; torus networks; total communication latency; transmission time; ultralow latency; Algorithm design and analysis; Clocks; Delay; Network topology; Routing; System-on-a-chip; Topology; Gaussian network; Network on chips; all-to-all broadcasting; hardware-based; multiprocessor system on chip; pipeline; routing;
fLanguage
English
Journal_Title
Computers, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9340
Type
jour
DOI
10.1109/TC.2012.126
Filename
6212460