• DocumentCode
    2954789
  • Title

    A method for communication efficient work distributions in stencil operation based applications on heterogeneous clusters

  • Author

    Schneible, Joseph ; Riha, Lubomir ; Malik, Maria ; El-Ghazawi, Tarek ; Alexandru, Andrei

  • Author_Institution
    Dept. of Electr. & Comput. Eng., George Washington Univ., Washington, DC, USA
  • fYear
    2012
  • fDate
    2-6 July 2012
  • Firstpage
    468
  • Lastpage
    474
  • Abstract
    In recent years, the use of accelerators in conjunction with CPUs, known as heterogeneous computing, has brought about significant performance increases for scientific applications. One of the best examples of this is Lattice Quantum Chromo-Dynamics (QCD), a stencil-operation-based simulation. These simulations have a large memory footprint, necessitating the use of many graphics processing units (GPUs) in parallel. This requires the use of a heterogeneous cluster with one or more GPUs per node. In order to obtain optimal performance, it is necessary to determine an efficient communication pattern between GPUs on the same node and between nodes. In this paper, we present a performance-model-based method for minimizing the communication time of applications with stencil operations, such as Lattice QCD, on heterogeneous computing systems with a non-blocking InfiniBand interconnection network. The proposed method is able to increase the performance of the most computationally intensive kernel of Lattice QCD by 25 percent due to improved overlapping of communication and computation.
  • Keywords
    graphics processing units; simulation; CPU; GPU; QCD; communication pattern; heterogeneous clusters; heterogeneous computing; infiniband interconnection network; lattice quantum chromo-dynamics; stencil operation based simulation; work distributions; Bandwidth; Benchmark testing; Computational modeling; Computer architecture; Graphics processing unit; Lattices; Switches; GPU acceleration; Heterogeneous computing; Lattice QCD; nearest neighbor; performance model; stencil operation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing and Simulation (HPCS), 2012 International Conference on
  • Conference_Location
    Madrid
  • Print_ISBN
    978-1-4673-2359-8
  • Type

    conf

  • DOI
    10.1109/HPCSim.2012.6266960
  • Filename
    6266960