• DocumentCode
    725414
  • Title

    DART-CUDA: A PGAS Runtime System for Multi-GPU Systems

  • Author

    Zhou, Lei; Fuerlinger, Karl

  • Author_Institution
    Dept. of Comput. Sci., Ludwig-Maximilians-Univ. (LMU) München, Munich, Germany
  • fYear
    2015
  • fDate
    June 29 2015-July 2 2015
  • Firstpage
    110
  • Lastpage
    119
  • Abstract
    The Partitioned Global Address Space (PGAS) approach is a promising programming model in high-performance parallel computing that combines the advantages of distributed memory systems and shared memory systems. The PGAS model has been used on a variety of hardware platforms in the form of PGAS programming languages such as Unified Parallel C (UPC), Chapel, and Fortress. However, despite its increasing adoption on distributed and shared memory systems, the extension of the PGAS model to accelerator platforms is still not well supported. To exploit the immense computational power of multi-GPU systems, this work presents the design and implementation of a PGAS model for multi-GPU systems. Several issues related to combining the logically separate GPU memories on multiple graphics cards are addressed. Furthermore, the execution model of modern GPU architectures is studied, and a task creation mechanism with load balancing is proposed. Our work is implemented in the context of the DASH project, a C++ template library that realizes PGAS semantics through operator overloading. Experimental results suggest promising performance of the design and its implementation.
  • Keywords
    C++ language; distributed memory systems; graphics processing units; parallel processing; resource allocation; software libraries; C++ template library; Chapel; DART-CUDA; DASH project; GPU architecture; GPU memories; PGAS approach; PGAS programming language; PGAS runtime system; UPC; distributed memory system; high performance parallel computing; load balancing; multiGPU system; multiple graphic card; partitioned global address space approach; shared memory system; task creation mechanism; unified parallel C; Computational modeling; Electronics packaging; Graphics processing units; Kernel; Programming; Resource management; Runtime; CUDA; Heterogeneous computing; MultiGPU systems; PGAS; Partitioned Global Address Space;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Computing (ISPDC), 2015 14th International Symposium on
  • Conference_Location
    Limassol
  • Print_ISBN
    978-1-4673-7147-6
  • Type

    conf

  • DOI
    10.1109/ISPDC.2015.20
  • Filename
    7165137