DocumentCode :
2998979
Title :
Optimized Reduce for Mesh-Based NoC Multiprocessors
Author :
Kohler, Adán ; Radetzki, Martin
Author_Institution :
Inst. für Tech. Inf., Univ. Stuttgart, Stuttgart, Germany
fYear :
2012
fDate :
21-25 May 2012
Firstpage :
904
Lastpage :
913
Abstract :
Future processors are expected to comprise a large number of computation cores interconnected by fast on-chip networks (Network-on-Chip, NoC). Such distributed structures motivate the use of message-passing programming models similar to MPI. Since the properties of these networks, such as the topology, are known and fixed after production, this knowledge can be used to optimize the communication stack. We describe two schemes that take advantage of this to accelerate the (All-)Reduce operation defined in MPI: a contention-avoiding rank-to-core mapping and a way of interleaving communication and computation by means of pipelining. Simulations show that the combination of both schemes can accelerate (All-)Reduce operations by more than 60%.
Keywords :
mesh generation; message passing; multiprocessing systems; network-on-chip; pipeline processing; MPI; all-reduce operation; communication stack; computation cores; mesh-based NoC multiprocessors; message passing programming models; on-chip networks; optimized reduce; pipelining; rank-to-core mapping; Bandwidth; Computer architecture; Network topology; Program processors; System-on-a-chip; Topology; Vectors; MPI; Network-on-Chip; Reduce;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International
Conference_Location :
Shanghai
Print_ISBN :
978-1-4673-0974-5
Type :
conf
DOI :
10.1109/IPDPSW.2012.111
Filename :
6270735