DocumentCode :
3588639
Title :
Heterogeneous CPU-GPU computing for the finite volume method on 3D unstructured meshes
Author :
Langguth, Johannes ; Cai, Xing
Author_Institution :
Simula Res. Lab., Lysaker, Norway
fYear :
2014
Firstpage :
191
Lastpage :
199
Abstract :
A recent trend in modern high-performance computing environments is the introduction of accelerators such as GPUs and the Xeon Phi, i.e., specialized computing devices that are optimized for highly parallel applications and coexist with CPUs. In regular compute-intensive applications with predictable data access patterns, these devices often outperform traditional CPUs by far and thus relegate them to pure control functions instead of computations. For irregular applications, however, the gap in relative performance can be much smaller, and is sometimes even reversed. Thus, maximizing overall performance in such systems requires making full use of all available computational resources. In this paper, we study the attainable performance of the cell-centered finite volume method on 3D unstructured tetrahedral meshes using heterogeneous systems consisting of CPUs and multiple GPUs. Finite volume methods are widely used numerical strategies for solving partial differential equations. The advantages of using finite volumes include built-in support for conservation laws and suitability for unstructured meshes. Our focus lies in demonstrating how a workload distribution that maximizes overall performance can be derived from the actual performance attained by the different computing devices in the heterogeneous environment. We also highlight the dual role of partitioning software in reordering and partitioning the input mesh, thus giving rise to a new combined approach to partitioning.
Keywords :
finite volume methods; graphics processing units; mathematics computing; mesh generation; parallel processing; 3D unstructured tetrahedral mesh; cell-centered finite volume method; central processing unit; data access pattern; graphics processing unit; heterogeneous CPU-GPU computing; heterogeneous system; high-performance computing environment; mesh partitioning; mesh reordering; partial differential equation; partitioning software; Bandwidth; Graphics processing units; Hardware; Instruction sets; Multicore processing; Particle separators; Three-dimensional displays;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2014 20th IEEE International Conference on
Type :
conf
DOI :
10.1109/PADSW.2014.7097808
Filename :
7097808