• DocumentCode
    2959411
  • Title

    Improving Parallel IO Performance of Cell-based AMR Cosmology Applications

  • Author

    Yu, Yongen ; Rudd, Douglas H. ; Lan, Zhiling ; Gnedin, Nickolay Y. ; Kravtsov, Andrey ; Wu, Jingjin

  • Author_Institution
    Dept. of Comput. Sci., Illinois Inst. of Technol., Chicago, IL, USA
  • fYear
    2012
  • fDate
    21-25 May 2012
  • Firstpage
    933
  • Lastpage
    944
  • Abstract
    To effectively model various regions with different resolutions, adaptive mesh refinement (AMR) is commonly used in cosmology simulations. There are two well-known numerical approaches to implementing AMR-based cosmology simulations: block-based AMR and cell-based AMR. While many studies have been conducted to improve the performance and scalability of block-structured AMR applications, little work has been done for cell-based simulations. In this study, we present a parallel IO design for cell-based AMR cosmology applications, in particular the ART (Adaptive Refinement Tree) code. First, we design a new data format that incorporates a space-filling curve to map between spatial and on-disk locations. This indexing not only enables concurrent IO accesses from multiple application processes, but also allows users to extract local regions without significant additional memory, CPU, or disk space overheads. Second, we develop a flexible N-M mapping mechanism to harvest the benefits of both N-N and N-1 mappings, where N is the number of application processes and M is a user-tunable number of files. It not only overcomes the limited bandwidth of an N-1 mapping by allowing the creation of multiple files, but also enables users to efficiently restart the application at a variety of computing scales. Third, we develop a user-level library that transparently and automatically aggregates small per-process IO accesses to accelerate IO performance. We evaluate this new parallel IO design by means of real cosmology simulations on a production HPC system at TACC. Our preliminary results indicate that it not only provides the functionality required by scientists (e.g., effective extraction of local regions and flexible process-to-file mapping), but also significantly improves IO performance.
  • Keywords
    astronomy computing; concurrency control; cosmology; input-output programs; mesh generation; parallel processing; software libraries; AMR based cosmology simulation; ART; HPC system; N-1 mapping; N-N mapping; TACC; adaptive mesh refinement; adaptive refinement tree code; block-based AMR; block-structured AMR application; cell-based AMR cosmology application; cell-based simulation; concurrent IO access; data format; flexible N-M mapping mechanism; on-disk location; parallel IO design; parallel IO performance; process-to file mapping; space filling curve; spatial location; user-level library; user-tunable parameter; Adaptation models; Computational modeling; Data models; Indexes; Layout; Numerical models; Subspace constraints; adaptive mesh refinement; cosmology simulations; data layout; high performance computing; parallel I/O;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International
  • Conference_Location
    Shanghai
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4673-0975-2
  • Type

    conf

  • DOI
    10.1109/IPDPS.2012.88
  • Filename
    6267900
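
  • Note
    The N-M mapping described in the abstract can be illustrated with a minimal sketch. The record does not say how the paper assigns processes to files, so the contiguous-block assignment below is an assumption, and the names file_index_for_rank and writers_per_file are hypothetical and do not come from the ART code. Setting M = 1 reproduces an N-1 mapping, and M = N reproduces an N-N mapping.

```python
from typing import List


def file_index_for_rank(rank: int, n_procs: int, n_files: int) -> int:
    """Map a process rank in [0, n_procs) to a file index in [0, n_files).

    Assumption: ranks are assigned to files in contiguous blocks of
    near-equal size. The paper's actual assignment rule is not given
    in this record.
    """
    if not 1 <= n_files <= n_procs:
        raise ValueError("need 1 <= n_files <= n_procs")
    if not 0 <= rank < n_procs:
        raise ValueError("rank out of range")
    # Integer scaling keeps ranks ordered and block sizes within one of each other.
    return rank * n_files // n_procs


def writers_per_file(n_procs: int, n_files: int) -> List[List[int]]:
    """List the ranks that write to each file under the mapping above."""
    groups: List[List[int]] = [[] for _ in range(n_files)]
    for rank in range(n_procs):
        groups[file_index_for_rank(rank, n_procs, n_files)].append(rank)
    return groups


if __name__ == "__main__":
    print(writers_per_file(8, 1))  # N-1: every rank shares one file
    print(writers_per_file(4, 4))  # N-N: one file per rank
    print(writers_per_file(8, 3))  # N-M with M = 3
```

    Under this assumed scheme, restarting at a different process count only requires recomputing which ranks read which files; the number of files M on disk stays fixed.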