Title :
Scalable fast multipole methods on distributed heterogeneous architectures
Author :
Hu, Qi ; Gumerov, Nail A. ; Duraiswami, Ramani
Author_Institution :
Dept. of Comput. Sci., Univ. of Maryland, College Park, MD, USA
Abstract :
We fundamentally reconsider implementation of the Fast Multipole Method (FMM) on a computing node with a heterogeneous CPU-GPU architecture with multicore CPU(s) and one or more GPU accelerators, as well as on an interconnected cluster of such nodes. The FMM is a divide-and-conquer algorithm that performs a fast N-body sum using a spatial decomposition and is often used in a time-stepping or iterative loop. Using the observation that the local summation and the analysis-based translation parts of the FMM are independent, we map these respectively to the GPUs and CPUs. Careful analysis of the FMM is performed to distribute work optimally between the multicore CPUs and the GPU accelerators. We first develop a single node version where the CPU part is parallelized using OpenMP and the GPU version via CUDA. New parallel algorithms for creating FMM data structures are presented together with load balancing strategies for the single node and distributed multiple-node versions. Our implementation can perform the N-body sum for 128M particles on 16 nodes in 4.23 seconds, a performance not achieved by others in the literature on such clusters.
Keywords :
data structures; divide and conquer methods; graphics processing units; iterative methods; multiprocessing systems; parallel architectures; CPU-GPU architecture; CUDA; FMM data structures; GPU accelerators; OpenMP; analysis based translation parts; distributed heterogeneous architectures; divide-and-conquer algorithm; iterative loop; multicore CPU; scalable fast multipole methods; time stepping loop; Arrays; Clustering algorithms; Graphics processing unit; Kernel; Receivers; Sorting
Conference_Title :
High Performance Computing, Networking, Storage and Analysis (SC), 2011 International Conference for
Conference_Location :
Seattle, WA
Electronic_ISBN :
978-1-4503-0771-0