DocumentCode :
1814743
Title :
Evaluate and optimize parallel Barnes-Hut algorithm for emerging many-core architectures
Author :
Xu, Thomas Canhao ; Liljeberg, Pasi ; Plosila, Juha ; Tenhunen, Hannu
Author_Institution :
Dept. of Inf. Technol., Univ. of Turku, Turku, Finland
fYear :
2013
fDate :
1-5 July 2013
Firstpage :
421
Lastpage :
428
Abstract :
This paper focuses on the use of Network-on-Chip (NoC) accelerators for Barnes-Hut N-body simulations. NoC-based architectures have been proposed to solve the communication bottleneck of processors with hundreds or even thousands of cores. An N-body simulation approximates the evolution of a system of bodies, e.g. an astrophysical system in which each body represents a star or a galaxy. Although the behaviour of the Barnes-Hut algorithm has been studied on conventional multicore systems, graphics processing units and other accelerators, we explore key performance issues in the context of an NoC platform. We investigate serial and parallel implementations, and analyze the parallel version in terms of network traffic. The results reveal that hot-spot and bursty traffic can congest the network, and that long-distance communication degrades system performance further. We propose algorithmic and interconnection optimizations, including improved data locality, proper mapping and a partially diagonal network. Evaluation results show that, compared with the original implementation, the average execution time and energy-delay product are reduced by 25.3% and 31.6%, respectively. The proposed design achieves a 55.4× speed-up over 64 threads.
Keywords :
multiprocessing systems; multiprocessor interconnection networks; network-on-chip; parallel algorithms; Barnes-Hut N-body simulations; NoC accelerators; NoC-based architecture; algorithmic optimization; average execution time reduction; bursty traffic; data locality improvement; energy delay product reduction; hot-spot; interconnection optimization; long distance communication; many-core architectures; network congestion; network traffic; network-on-chip accelerators; parallel Barnes-Hut algorithm evaluation; parallel Barnes-Hut algorithm optimization; parallel implementation; partially-diagonal network; performance issues; serial implementation; Complexity theory; Computational modeling; Multicore processing; Network-on-chip; Program processors; Tiles; Barnes-Hut; Interconnection; Many-core; Mapping; Network-on-Chip; Parallel System;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Computing and Simulation (HPCS), 2013 International Conference on
Conference_Location :
Helsinki
Print_ISBN :
978-1-4799-0836-3
Type :
conf
DOI :
10.1109/HPCSim.2013.6641449
Filename :
6641449