• DocumentCode
    607272
  • Title

    Hybrid Embarrassingly Parallel algorithm for heterogeneous CPU/GPU clusters

  • Author

    Bo Yang ; Kai Lu ; Jie Liu ; Xiaoping Wang ; Chunye Gong

  • Author_Institution
    Dept. of Comput. Sci., Nat. Univ. of Defense Technol., Changsha, China
  • fYear
    2012
  • fDate
    3-5 Dec. 2012
  • Firstpage
    373
  • Lastpage
    378
  • Abstract
    High-performance computing is increasingly focused on heterogeneous architectures. The Embarrassingly Parallel (EP) algorithm is typical of Monte Carlo methods, which are widely applied in many important scientific areas. In this paper, we present an efficient Hybrid Embarrassingly Parallel algorithm for heterogeneous CPU/GPU clusters and an effective task distribution model for load balancing between CPU and GPU. Based on the task distribution model, our Hybrid EP algorithm can use the computing capability of both the multi-core CPU and the many-core GPU simultaneously. We test the Hybrid EP algorithm on various types of CPUs and GPUs and on the Tianhe-1A supercomputer. The overall speedup of the M2050 GPU ranges from 10.84 times relative to a six-core X5670 to over 50.53 times relative to a quad-core Q6600. The heterogeneous CPU/GPU Tianhe-1A supercomputer, in which both CPU and GPU are fully used, outperforms a pure CPU cluster by a factor of 6.86. The speedup increases linearly with the number of nodes, and the average efficiency reaches up to 98.72% for 4096 nodes.
  • Keywords
    Monte Carlo methods; graphics processing units; mainframes; multiprocessing systems; parallel algorithms; parallel architectures; parallel machines; parallel processing; pattern clustering; performance evaluation; resource allocation; M2050 GPU; Monte Carlo method; computing capability; distribution model; heterogeneous CPU-GPU Tianhe-1A supercomputer; heterogeneous CPU-GPU clusters; heterogeneous architecture; high performance computing; hybrid EP algorithm; hybrid embarrassingly parallel algorithm; load balancing; many-core GPU; multicore CPU; quad cores Q6600; six cores X5670; task distribution model; GPU; Hybrid Embarrassingly Parallel; Tianhe-1A; heterogeneous cluster
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computing and Convergence Technology (ICCCT), 2012 7th International Conference on
  • Conference_Location
    Seoul
  • Print_ISBN
    978-1-4673-0894-6
  • Type

    conf

  • Filename
    6530361