DocumentCode
1998169
Title
Scheduling a Parallel Sparse Direct Solver to Multiple GPUs
Author
Kim, Kyungjoo ; Eijkhout, Victor
Author_Institution
Dept. of Aerosp. Eng. & Eng. Mech., Univ. of Texas at Austin, Austin, TX, USA
fYear
2013
fDate
20-24 May 2013
Firstpage
1401
Lastpage
1408
Abstract
We present a sparse direct solver using multi-level task scheduling on a modern heterogeneous compute node consisting of a multi-core host processor and multiple GPU accelerators. Our direct solver is based on the multifrontal method, which is characterized by exploiting dense subproblems (fronts) related in an assembly tree. Critical to the high performance of the solver is dynamic task allocation that accounts for the asymmetric performance of heterogeneous devices. Device-specific tasks are generated and adapted to different devices in the course of multifrontal factorization using multi-level matrix partitioning. Large blocks are used to provide coarse-grained tasks for fast devices, and some of the blocks are recursively partitioned to supply fine-grained tasks for the next available (slower) devices. Experimental results are obtained from particular problems arising from a high-order Finite Element Method.
Keywords
finite element analysis; graphics processing units; matrix decomposition; multiprocessing systems; parallel processing; processor scheduling; GPU accelerators; assembly tree; asymmetric heterogeneous device performance; device-specific tasks; dynamic task allocation; fine-grained tasks; finite element method; multicore host processor; multifrontal factorization; multifrontal method; multilevel matrix partitioning; multilevel task scheduling; parallel sparse direct solver scheduling; Graphics processing units; Libraries; Multicore processing; Partitioning algorithms; Performance evaluation; Processor scheduling; Sparse matrices; Algorithms-by-blocks; Bulk-synchronous model; Heterogeneous architectures; MultiGPU; Multicore; Multifrontal factorization; hp-FEM
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
Conference_Location
Cambridge, MA
Print_ISBN
978-0-7695-4979-8
Type
conf
DOI
10.1109/IPDPSW.2013.26
Filename
6651033