• DocumentCode
    2907102
  • Title
    Simulation-Based Performance Analysis and Tuning for a Two-Level Directly Connected System
  • Author
    Totoni, Ehsan ; Bhatele, Abhinav ; Bohm, Eric J. ; Jain, Nikhil ; Mendes, Celso L. ; Mokos, Ryan M. ; Zheng, Gengbin ; Kale, Laxmikant V.
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
  • fYear
    2011
  • fDate
    7-9 Dec. 2011
  • Firstpage
    340
  • Lastpage
    347
  • Abstract
    Hardware and software co-design is becoming increasingly important due to complexities in supercomputing architectures. Simulating applications before there is access to the real hardware can assist machine architects in making better design decisions that can optimize application performance. At the same time, the application and runtime can be optimized and tuned beforehand. BigSim is a simulation-based performance prediction framework designed for these purposes. It can be used to perform packet-level network simulations of parallel applications using existing parallel machines. In this paper, we demonstrate the utility of BigSim in analyzing and optimizing parallel application performance for future systems based on the PERCS network. We present simulation studies using benchmarks and real applications expected to run on future supercomputers. Future petascale systems will have more than 100,000 cores, and we present simulations at that scale.
  • Keywords
    digital simulation; hardware-software codesign; parallel machines; performance evaluation; BigSim; hardware and software codesign; machine architecture; packet level network simulations; parallel applications; simulation based performance analysis; simulation based performance tuning; supercomputing architectures; two level directly connected system; Network topology; Noise; Routing; Supercomputers; Three dimensional displays; Topology; Tuning; collective communication; mapping; performance prediction; simulation; system noise
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Parallel and Distributed Systems (ICPADS), 2011 IEEE 17th International Conference on
  • Conference_Location
    Tainan
  • ISSN
    1521-9097
  • Print_ISBN
    978-1-4577-1875-5
  • Type
    conf
  • DOI
    10.1109/ICPADS.2011.121
  • Filename
    6121296