• DocumentCode
    679642
  • Title
    A Study of the Effect of Partitioning on Parallel Simulation of Multicore Systems
  • Author
    Zhenjiang Dong ; Jun Wang ; Riley, George F. ; Yalamanchili, Sudhakar
  • Author_Institution
    Sch. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
  • fYear
    2013
  • fDate
    14-16 Aug. 2013
  • Firstpage
    375
  • Lastpage
    379
  • Abstract
    There has been little research on the effect of partitioning on parallel simulation of multicore systems. This paper studies this problem in the context of the null-message-based synchronization algorithm for parallel multicore simulation. It focuses on coarse-grain parallel simulation, where each core and its cache slices are modeled within a single logical process (LP) and different partitioning schemes are applied only to the interconnection network. We show that encapsulating the entire on-chip interconnection network in a single logical process is an impediment to scalable simulation. This baseline partitioning and two other schemes are investigated. Experiments are conducted on a subset of the PARSEC benchmarks with 16-, 32-, 64- and 128-core models. Results show that the partitioning scheme has a significant impact on simulation performance and parallel efficiency. Beyond a certain system scale, one scheme consistently outperforms the other two, and the performance and efficiency gaps increase as the size of the model increases, with up to 4.1 times faster speed and 277% better efficiency for 128-core models. We explain the reasons for this behavior, which can be traced to the features of the null-message-based synchronization algorithm. Because of this, we believe that if a component has an increasing number of inter-LP interactions with increasing system size, it should be partitioned into several sub-components to achieve better performance.
  • Keywords
    cache storage; discrete event simulation; multiprocessing systems; multiprocessor interconnection networks; parallel architectures; performance evaluation; synchronisation; 128-core model; 16-core model; 32-core model; 64-core model; PARSEC benchmarks; baseline partitioning; cache slices; coarse grain parallel simulation; interLP interactions; logical process; null-message-based synchronization algorithm; on-chip interconnection network; parallel efficiency; parallel multicore simulation; Benchmark testing; Computational modeling; Manifolds; Multicore processing; Multiprocessor interconnection; Partitioning algorithms; Synchronization; multicore system; null message algorithm; parallel simulation; partitioning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2013 IEEE 21st International Symposium on
  • Conference_Location
    San Francisco, CA
  • ISSN
    1526-7539
  • Type
    conf
  • DOI
    10.1109/MASCOTS.2013.55
  • Filename
    6730790