DocumentCode :
3248470
Title :
A dynamic programming approach to optimizing the blocking strategy for the Householder QR decomposition
Author :
Fukaya, Takeshi ; Yamamoto, Yusaku ; Zhang, Shao-liang
Author_Institution :
Grad. Sch. of Eng., Nagoya Univ., Nagoya
fYear :
2008
fDate :
Sept. 29 2008-Oct. 1 2008
Firstpage :
402
Lastpage :
410
Abstract :
In this paper, we present a new approach to optimizing the blocking strategy for the householder QR decomposition. In high performance implementations of the householder QR algorithm, it is common to use a blocking technique for the efficient use of the cache memory. There are several well known blocking strategies like the fixed-size blocking and recursive blocking, and usually their parameters such as the block size and the recursion level are tuned according to the target machine and the problem size. However, strategies generated with this kind of parameter optimization constitute only a small fraction of all possible blocking strategies. Given the complex performance characteristics of modern microprocessors, non-standard strategies may prove effective on some machines. Considering this situation, we first propose a new universal model that can express a far larger class of blocking strategies than has been considered so far. Next, we give an algorithm to find a near-optimal strategy from this class using dynamic programming. As a result of this approach, we found an effective blocking strategy that has never been reported. Performance evaluation on the Opteron and Core2 processors show that our strategy achieves about 1.2 times speedup over recursive blocking when computing the QR decomposition of a 6000 times 6000 matrix.
Keywords :
cache storage; dynamic programming; matrix decomposition; microprocessor chips; Core2 processor; Opteron processor; blocking strategy; cache memory; dynamic programming; householder QR decomposition; microprocessor; parameter optimization; Aggregates; Binary trees; Cache memory; Dynamic programming; Heuristic algorithms; High performance computing; Least squares methods; Matrix decomposition; Microprocessors; Singular value decomposition;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Cluster Computing, 2008 IEEE International Conference on
Conference_Location :
Tsukuba
ISSN :
1552-5244
Print_ISBN :
978-1-4244-2639-3
Electronic_ISBN :
1552-5244
Type :
conf
DOI :
10.1109/CLUSTR.2008.4663801
Filename :
4663801
Link To Document :
بازگشت