• DocumentCode
    3199134
  • Title

    A Scheduling and Runtime Framework for a Cluster of Heterogeneous Machines with Multiple Accelerators

  • Author

    Beri, Tarun ; Bansal, Sorav ; Kumar, Subodh

  • Author_Institution
    Indian Inst. of Technol. Delhi, New Delhi, India
  • fYear
    2015
  • fDate
    25-29 May 2015
  • Firstpage
    146
  • Lastpage
    155
  • Abstract
    We present a runtime system for simple and efficient programming of CPU+GPU clusters. The programmer focuses on core logic, while the system undertakes task allocation, load balancing, scheduling, data transfer, etc. Our programming model is based on a shared global address space, made efficient by transaction style bulk-synchronous semantics. This model broadly targets coarse-grained data parallel computation, particularly suited to multi-GPU heterogeneous clusters. We describe our computation and communication scheduling system and report its performance on a few prototype applications. For example, parallelization of matrix multiplication or 2D FFT using our system requires the regular CPU/GPU implementations and about 30 lines of additional C code to set up the runtime. Our runtime system achieves a performance of 5.61 TFlop/s while multiplying two square matrices of 1.56 billion elements each over a 10-node cluster with 20 GPUs. This performance is possible due to a number of critical optimizations working in concert. These include prefetching, pipelining, maximizing overlap between computation and communication, and scheduling efficiently across heterogeneous devices of vastly different capacities.
  • Keywords
    graphics processing units; parallel processing; resource allocation; scheduling; CPU+GPU cluster programming; accelerator; data transfer; graphics processing unit; heterogeneous machine; high-performance computing; load balancing; runtime framework; scheduling framework; task allocation; transaction style bulk-synchronous semantics; Data transfer; Graphics processing units; Kernel; Message systems; Programming; Runtime; Subscriptions; Heterogeneous Architectures; High Performance Computing; Hybrid CPU-GPU Clusters; Multi Scheduling; Work Stealing;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International
  • Conference_Location
    Hyderabad
  • ISSN
    1530-2075
  • Type

    conf

  • DOI
    10.1109/IPDPS.2015.12
  • Filename
    7161504