• DocumentCode
    1783303
  • Title

    Balancing CPU-GPU Collaborative High-Order CFD Simulations on the Tianhe-1A Supercomputer

  • Author

    Chuanfu Xu ; Lilun Zhang ; Xiaogang Deng ; Jianbin Fang ; Guangxue Wang ; Wei Cao ; Yonggang Che ; Yongxian Wang ; Wei Liu

  • Author_Institution
    Coll. of Comput. Sci., Nat. Univ. of Defense Technol., Changsha, China
  • fYear
    2014
  • fDate
    19-23 May 2014
  • Firstpage
    725
  • Lastpage
    734
  • Abstract
    HOSTA is an in-house high-order CFD software package that can simulate complex flows with complex geometries. Large-scale high-order CFD simulations using HOSTA require massive HPC resources, which motivated us to port it onto modern GPU-accelerated supercomputers like Tianhe-1A. To achieve a greater speedup and fully tap the potential of Tianhe-1A, we use the CPU and GPU collaboratively for HOSTA instead of taking a naive GPU-only approach. We present multiple novel techniques to balance the loads between the store-poor GPU and the store-rich CPU, and to overlap the collaborative computation and communication as far as possible. Taking CPU and GPU load balance into account, we improve the maximum simulation problem size per Tianhe-1A node for HOSTA by 2.3X, while the collaborative approach improves performance by around 45% compared to the GPU-only approach. Scalability tests show that HOSTA can achieve a parallel efficiency above 60% on 1024 Tianhe-1A nodes. With our method, we have successfully simulated China's large civil airplane configuration C919 containing 150M grid cells. To the best of our knowledge, this is the first paper that reports a CPU-GPU collaborative high-order accurate aerodynamic simulation result with such a complex grid geometry.
  • Keywords
    computational fluid dynamics; flow simulation; graphics processing units; parallel machines; resource allocation; CPU-GPU collaborative high-order CFD simulations; CPU-GPU collaborative high-order accurate aerodynamic simulation; GPU accelerated supercomputers; HOSTA; Tianhe-1A supercomputer; complex grid geometry; in-house high-order CFD software; large scale high-order CFD simulations; load balancing; massive HPC resources; maximum simulation problem size per Tianhe-1A node; naive GPU-only approach; simulated China large civil airplane configuration C919; store-poor GPU; store-rich CPU; Collaboration; Computational fluid dynamics; Computational modeling; Graphics processing units; Kernel; Memory management; Performance evaluation; CFD; CPU-GPU collaboration; GPU parallelization; high-order finite difference scheme;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium, 2014 IEEE 28th International
  • Conference_Location
    Phoenix, AZ
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4799-3799-8
  • Type

    conf

  • DOI
    10.1109/IPDPS.2014.80
  • Filename
    6877304