• DocumentCode
    592113
  • Title
    Performance Analysis and Optimization of the Tiled Cholesky Factorization on NUMA Machines
  • Author
    Jeannot, Emmanuel
  • Author_Institution
    LaBRI, Inria Bordeaux Sud-Ouest, Bordeaux, France
  • fYear
    2012
  • fDate
    17-20 Dec. 2012
  • Firstpage
    210
  • Lastpage
    217
  • Abstract
    We discuss some performance issues of the tiled Cholesky factorization on non-uniform memory access (NUMA) shared-memory machines. We show how to optimize thread placement and data placement to achieve performance gains of up to 50% compared to state-of-the-art libraries such as Plasma or MKL.
  • Keywords
    matrix decomposition; parallel processing; performance evaluation; shared memory systems; MKL; NUMA machines; Plasma; data placement optimization; nonuniform memory access time shared memory machines; performance analysis; performance gain; state-of-the-art libraries; thread placement optimization; tiled Cholesky factorization; Instruction sets; Kernel; Message systems; Parallel processing; Resource management; Tiles; Vectors; Cholesky factorization; NUMA; thread placement
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Architectures, Algorithms and Programming (PAAP), 2012 Fifth International Symposium on
  • Conference_Location
    Taipei
  • ISSN
    2168-3034
  • Print_ISBN
    978-1-4673-4566-8
  • Type
    conf
  • DOI
    10.1109/PAAP.2012.38
  • Filename
    6424759