• DocumentCode
    1998169
  • Title
    Scheduling a Parallel Sparse Direct Solver to Multiple GPUs
  • Author
    Kim, Kyungjoo; Eijkhout, Victor
  • Author_Institution
    Dept. of Aerosp. Eng. & Eng. Mech., Univ. of Texas at Austin, Austin, TX, USA
  • fYear
    2013
  • fDate
    20-24 May 2013
  • Firstpage
    1401
  • Lastpage
    1408
  • Abstract
    We present a sparse direct solver that uses multi-level task scheduling on a modern heterogeneous compute node consisting of a multi-core host processor and multiple GPU accelerators. Our direct solver is based on the multifrontal method, which exploits dense subproblems (fronts) related through an assembly tree. Critical to the solver's performance is dynamic task allocation, which accounts for the asymmetric performance of heterogeneous devices. Device-specific tasks are generated and adapted to different devices in the course of multifrontal factorization using multi-level matrix partitioning. Large blocks provide coarse-grained tasks for fast devices, and some of the blocks are recursively partitioned to supply fine-grained tasks for the next available (slower) devices. Experimental results are obtained for problems arising from a high-order finite element method.
  • Keywords
    finite element analysis; graphics processing units; matrix decomposition; multiprocessing systems; parallel processing; processor scheduling; GPU accelerators; assembly tree; asymmetric heterogeneous device performance; device-specific tasks; dynamic task allocation; fine-grained tasks; finite element method; multicore host processor; multifrontal factorization; multifrontal method; multilevel matrix partitioning; multilevel task scheduling; parallel sparse direct solver scheduling; Graphics processing units; Libraries; Multicore processing; Partitioning algorithms; Performance evaluation; Processor scheduling; Sparse matrices; Algorithms-by-blocks; Bulk-synchronous model; Heterogeneous architectures; MultiGPU; Multicore; Multifrontal factorization; hp-FEM;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
  • Conference_Location
    Cambridge, MA
  • Print_ISBN
    978-0-7695-4979-8
  • Type
    conf
  • DOI
    10.1109/IPDPSW.2013.26
  • Filename
    6651033
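
The multi-level partitioning described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the function names (`split`, `schedule`), the device table, the quadrant split, and the cubic cost model are all hypothetical choices made for illustration. Coarse blocks go to fast devices, and a coarse block is recursively split into fine blocks whenever a slower device becomes available and no fine block is queued.

```python
import heapq
from collections import deque

def split(block):
    """Split a square block (row, col, size) into four quadrants.
    Assumes size is even (hypothetical 2x2 partitioning)."""
    r, c, n = block
    h = n // 2
    return [(r + i * h, c + j * h, h) for i in (0, 1) for j in (0, 1)]

def schedule(front_size, coarse, fine, devices):
    """Greedy simulation: the next available device takes the next task.

    devices maps a name to (relative_speed, wants_coarse).
    Assumes front_size is a multiple of coarse, and coarse = fine * 2**k.
    Returns a list of (device_name, block) assignments.
    """
    coarse_q = deque((r, c, coarse)
                     for r in range(0, front_size, coarse)
                     for c in range(0, front_size, coarse))
    fine_q = deque()
    ready = [(0.0, name) for name in sorted(devices)]  # (time free, device)
    heapq.heapify(ready)
    log = []
    while coarse_q or fine_q:
        t, name = heapq.heappop(ready)
        speed, wants_coarse = devices[name]
        if wants_coarse and coarse_q:
            block = coarse_q.popleft()
        else:
            if not fine_q:
                # Recursively refine one coarse block into fine blocks.
                pending = [coarse_q.pop()]
                while pending:
                    b = pending.pop()
                    if b[2] > fine:
                        pending.extend(split(b))
                    else:
                        fine_q.append(b)
            block = fine_q.popleft()
        # Assumed cost model: dense-kernel work grows cubically with block size.
        cost = block[2] ** 3 / speed
        log.append((name, block))
        heapq.heappush(ready, (t + cost, name))
    return log

if __name__ == "__main__":
    # Hypothetical node: one GPU eight times faster than one CPU core.
    devices = {"gpu0": (8.0, True), "cpu0": (1.0, False)}
    for name, block in schedule(8, 4, 2, devices):
        print(name, block)
```

In this toy model a fast device falls back to fine blocks once the coarse queue is empty, so no device idles while work remains.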