DocumentCode :
592113
Title :
Performance Analysis and Optimization of the Tiled Cholesky Factorization on NUMA Machines
Author :
Jeannot, Emmanuel
Author_Institution :
LaBRI, Inria Bordeaux Sud-Ouest, Bordeaux, France
fYear :
2012
fDate :
17-20 Dec. 2012
Firstpage :
210
Lastpage :
217
Abstract :
We discuss some performance issues of the tiled Cholesky factorization on non-uniform memory access (NUMA) shared-memory machines. We show how to optimize thread placement and data placement to achieve performance gains of up to 50% compared to state-of-the-art libraries such as Plasma or MKL.
Keywords :
matrix decomposition; parallel processing; performance evaluation; shared memory systems; MKL; NUMA machines; Plasma; data placement optimization; nonuniform memory access time shared memory machines; performance analysis; performance gain; state-of-the-art libraries; thread placement optimization; tiled Cholesky factorization; Instruction sets; Kernel; Message systems; Parallel processing; Resource management; Tiles; Vectors; Cholesky factorization; NUMA; thread placement;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel Architectures, Algorithms and Programming (PAAP), 2012 Fifth International Symposium on
Conference_Location :
Taipei
ISSN :
2168-3034
Print_ISBN :
978-1-4673-4566-8
Type :
conf
DOI :
10.1109/PAAP.2012.38
Filename :
6424759