DocumentCode :
679642
Title :
A Study of the Effect of Partitioning on Parallel Simulation of Multicore Systems
Author :
Dong, Zhenjiang ; Wang, Jun ; Riley, George F. ; Yalamanchili, Sudhakar
Author_Institution :
Sch. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
fYear :
2013
fDate :
14-16 Aug. 2013
Firstpage :
375
Lastpage :
379
Abstract :
Little research has examined the effect of partitioning on parallel simulation of multicore systems. This paper studies that problem in the context of the null-message-based synchronization algorithm for parallel multicore simulation. It focuses on coarse-grain parallel simulation, where each core and its cache slices are modeled within a single logical process (LP) and different partitioning schemes are applied only to the interconnection network. We show that encapsulating the entire on-chip interconnection network in a single logical process is an impediment to scalable simulation. This baseline partitioning and two other schemes are investigated. Experiments are conducted on a subset of the PARSEC benchmarks with 16-, 32-, 64- and 128-core models. Results show that the partitioning scheme has a significant impact on simulation performance and parallel efficiency. Beyond a certain system scale, one scheme consistently outperforms the other two, and the performance and efficiency gaps widen as the size of the model increases, reaching up to 4.1 times faster simulation and 277% better efficiency for 128-core models. We explain the reasons for this behavior, which can be traced to the features of the null-message-based synchronization algorithm. We therefore believe that if a component's number of inter-LP interactions grows with system size, that component should be partitioned into several sub-components to achieve better performance.
Keywords :
cache storage; discrete event simulation; multiprocessing systems; multiprocessor interconnection networks; parallel architectures; performance evaluation; synchronisation; 128-core model; 16-core model; 32-core model; 64-core model; PARSEC benchmarks; baseline partitioning; cache slices; coarse grain parallel simulation; interLP interactions; logical process; null-message-based synchronization algorithm; on-chip interconnection network; parallel efficiency; parallel multicore simulation; Benchmark testing; Computational modeling; Manifolds; Multicore processing; Multiprocessor interconnection; Partitioning algorithms; Synchronization; multicore system; null message algorithm; parallel simulation; partitioning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2013 IEEE 21st International Symposium on
Conference_Location :
San Francisco, CA
ISSN :
1526-7539
Type :
conf
DOI :
10.1109/MASCOTS.2013.55
Filename :
6730790