Scheduling a Parallel Sparse Direct Solver to Multiple GPUs

Author

Kim, Kyungjoo; Eijkhout, Victor

Author_Institution

Dept. of Aerosp. Eng. & Eng. Mech., Univ. of Texas at Austin, Austin, TX, USA

fYear

2013

fDate

20-24 May 2013

Firstpage

1401

Lastpage

1408

Abstract

We present a sparse direct solver using multi-level task scheduling on a modern heterogeneous compute node consisting of a multi-core host processor and multiple GPU accelerators. Our direct solver is based on the multifrontal method, which is characterized by exploiting dense subproblems (fronts) related in an assembly tree. Critical to high performance of the solver is dynamic task allocation to account for the asymmetric performance of heterogeneous devices. Device-specific tasks are generated and adapted to different devices in the course of multifrontal factorization using multi-level matrix partitioning. Large blocks are used to provide coarse-grained tasks for fast devices, and some of the blocks are recursively partitioned to supply fine-grained tasks for the next available (slower) devices. Experimental results are obtained from particular problems arising from a high-order Finite Element Method.

Keywords

finite element analysis; graphics processing units; matrix decomposition; multiprocessing systems; parallel processing; processor scheduling; GPU accelerators; assembly tree; asymmetric heterogeneous device performance; device-specific tasks; dynamic task allocation; fine-grained tasks; finite element method; multicore host processor; multifrontal factorization; multifrontal method; multilevel matrix partitioning; multilevel task scheduling; parallel sparse direct solver scheduling; Graphics processing units; Libraries; Multicore processing; Partitioning algorithms; Performance evaluation; Processor scheduling; Sparse matrices; Algorithms-by-blocks; Bulk-synchronous model; Heterogeneous architectures; MultiGPU; Multicore; Multifrontal factorization; hp-FEM

fLanguage

English

Publisher

IEEE

Conference_Titel

Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International

Conference_Location

Cambridge, MA

Print_ISBN

978-0-7695-4979-8

Type

conf

DOI

10.1109/IPDPSW.2013.26

Filename

6651033