Title :
Adapting communication-avoiding LU and QR factorizations to multicore architectures
Author :
Donfack, Simplice ; Grigori, Laura ; Gupta, Alok Kumar
Author_Institution :
INRIA Saclay-Ile de France, Univ. Paris-Sud 11, Orsay, France
Abstract :
This paper studies algorithms for computing the LU and QR factorizations of dense matrices. Two communication-optimal algorithms, communication-avoiding LU (CALU) and communication-avoiding QR (CAQR), were recently introduced for distributed-memory architectures. We discuss two algorithms based on CALU and CAQR that are adapted to multicore architectures. They combine the communication-reducing ideas of communication-avoiding algorithms with asynchronism and dynamic task scheduling. For tall and skinny matrices, that is, matrices with many more rows than columns, the two algorithms outperform the corresponding routines from the Intel MKL vendor library on a dual-socket, quad-core machine based on the Intel Xeon EM64T processor and on a four-socket, quad-core machine based on the AMD Opteron processor. For these matrices, multithreaded CALU outperforms the corresponding routine dgetrf from the Intel MKL library by up to a factor of 2.3 and the corresponding routine dgetrf from the ACML library by up to a factor of 5, while multithreaded CAQR outperforms the corresponding dgeqrf routine from the MKL library by a factor of 5.3.
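The reduction idea behind communication-avoiding QR on a tall and skinny matrix can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `mgs_r` and `tsqr_r` are hypothetical, modified Gram-Schmidt stands in for the Householder QR a real code would use, the binary tree is walked sequentially with no threads or dynamic scheduling, and every row block is assumed to have full column rank.

```python
import math


def mgs_r(rows):
    """Return the upper-triangular R factor of an m x n matrix (m >= n).

    Uses modified Gram-Schmidt, so the diagonal of R is positive.
    Assumes the matrix has full column rank (hypothetical helper).
    """
    m, n = len(rows), len(rows[0])
    # Work on column copies so the input is left untouched.
    cols = [[rows[i][j] for i in range(m)] for j in range(n)]
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        r[j][j] = math.sqrt(sum(x * x for x in cols[j]))
        q = [x / r[j][j] for x in cols[j]]
        # Remove the q component from every remaining column.
        for k in range(j + 1, n):
            r[j][k] = sum(a * b for a, b in zip(q, cols[k]))
            cols[k] = [b - r[j][k] * a for a, b in zip(q, cols[k])]
    return r


def tsqr_r(a, num_blocks):
    """Return R of the tall-skinny matrix `a` via a binary reduction tree.

    Each row block is factored independently (the leaves), then pairs of
    small R factors are stacked and factored again until one R remains.
    """
    step = len(a) // num_blocks
    blocks = [a[i * step:(i + 1) * step] for i in range(num_blocks - 1)]
    blocks.append(a[(num_blocks - 1) * step:])  # last block takes the remainder
    level = [mgs_r(b) for b in blocks]
    while len(level) > 1:
        # List concatenation stacks two n x n factors into a 2n x n matrix.
        nxt = [mgs_r(level[i] + level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd factor is carried up unchanged
        level = nxt
    return level[0]


# Example input: 8 rows, 2 columns, reduced over 4 row blocks.
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 9.0],
     [2.0, 1.0], [0.0, 1.0], [1.0, 0.0], [4.0, 3.0]]
R = tsqr_r(A, 4)
```

Only the small n x n factors move between tree levels, which is where the reduction in communication comes from; the leaf factorizations are independent of one another and so are natural units of work for a task scheduler.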
Keywords :
matrix algebra; memory architecture; multi-threading; multiprocessing programs; AMD Opteron processor; Intel MKL; Intel Xeon EM64T processor; communication optimal algorithms; communication-avoiding LU factorizations; communication-avoiding QR factorizations; dense matrices; distributed memory architectures; multicore architectures; quad-core machine; Binary trees; Dynamic scheduling; Iterative algorithms; Libraries; Matrix decomposition; Multicore processing; Processor scheduling; Scheduling algorithm; Yarn; LU and QR factorizations; communication avoiding algorithms
Conference_Title :
Parallel & Distributed Processing (IPDPS), 2010 IEEE International Symposium on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4244-6442-5
DOI :
10.1109/IPDPS.2010.5470348