DocumentCode :
1914038
Title :
Modeling a Million-Node Dragonfly Network Using Massively Parallel Discrete-Event Simulation
Author :
Mubarak, Misbah ; Carothers, Christopher D. ; Ross, Robert ; Carns, Philip
Author_Institution :
Comput. Sci. Dept., Rensselaer Polytech. Inst., Troy, NY, USA
fYear :
2012
fDate :
10-16 Nov. 2012
Firstpage :
366
Lastpage :
376
Abstract :
A low-latency and low-diameter interconnection network will be an important component of future exascale architectures. The dragonfly network topology, a two-level directly connected network, is a candidate for exascale architectures because of its low diameter and reduced latency. To date, small-scale simulations with a few thousand nodes have been carried out to examine the dragonfly topology. However, future exascale machines will have millions of cores and up to 1 million nodes. In this paper, we focus on the modeling and simulation of large-scale dragonfly networks using the Rensselaer Optimistic Simulation System (ROSS). We validate the results of our model against the cycle-accurate simulator “booksim”. We also compare the performance of booksim and ROSS for the dragonfly network model at modest scales. We demonstrate the performance of ROSS on both the Blue Gene/P and Blue Gene/Q systems on a dragonfly model with up to 50 million nodes, showing a peak event rate of 1.33 billion events/second and a total of 872 billion committed events. The dragonfly network model for million-node configurations strongly scales when going from 1,024 to 65,536 MPI tasks on IBM Blue Gene/P and IBM Blue Gene/Q systems. We also explore a variety of ROSS tuning parameters to get optimal results with the dragonfly network model.
Keywords :
computer network performance evaluation; discrete event simulation; message passing; multiprocessor interconnection networks; parallel architectures; telecommunication network topology; IBM Blue Gene/P system; IBM Blue Gene/Q system; ROSS tuning parameters; Rensselaer Optimistic Simulation System; booksim cycle-accurate simulator; dragonfly network topology; exascale architectures; exascale machines; large-scale dragonfly network modeling; large-scale dragonfly network simulation; latency reduction; low-latency low-diameter interconnection network; massively parallel discrete-event simulation; million-node dragonfly network modeling; peak event rate; performance evaluation; two-level directly connected network; ROSS; dragonfly; parallel discrete event simulation; routing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion
Conference_Location :
Salt Lake City, UT
Print_ISBN :
978-1-4673-6218-4
Type :
conf
DOI :
10.1109/SC.Companion.2012.56
Filename :
6495838