DocumentCode :
3204539
Title :
Multifrontal Factorization of Sparse SPD Matrices on GPUs
Author :
George, Thomas ; Saxena, Vaibhav ; Gupta, Anshul ; Singh, Amik ; Choudhury, Anamitra R.
Author_Institution :
High Performance Comput. Group, IBM Res. India, New Delhi, India
fYear :
2011
fDate :
16-20 May 2011
Firstpage :
372
Lastpage :
383
Abstract :
Solving large sparse linear systems is often the most computationally intensive component of many scientific computing applications. In the past, sparse multifrontal direct factorization has been shown to scale to thousands of processors on dedicated supercomputers resulting in a substantial reduction in computational time. In recent years, an alternative computing paradigm based on GPUs has gained prominence, primarily due to its affordability, power-efficiency, and the potential to achieve significant speedup relative to desktop performance on regular and structured parallel applications. However, sparse matrix factorization on GPUs has not been explored sufficiently due to the complexity involved in an efficient implementation and concerns of low GPU utilization. In this paper, we present an adaptive hybrid approach for accelerating sparse multifrontal factorization based on a judicious exploitation of the processing power of the host CPU and GPU. We present four different policies for distributing and scheduling the workload between the host CPU and the GPU, and propose a mechanism for a runtime selection of the appropriate policy for each step of sparse Cholesky factorization. This mechanism relies on auto-tuning based on modeling the best policy predictor as a parametric classifier. We estimate the classifier parameters from the available empirical computation time data such that the expected computation time is minimized. This approach is readily adaptable for using the current or an extended set of policies for different CPU-GPU combinations as well as for different combinations of dense kernels for both the CPU and the GPU.
Keywords :
coprocessors; mathematics computing; matrix decomposition; pattern classification; processor scheduling; sparse matrices; CPU; GPU; adaptive hybrid approach; best policy predictor modeling; dedicated supercomputer; multifrontal factorization; parametric classifier; processor; runtime selection; scientific computing application; sparse Cholesky factorization; sparse linear system; sparse matrix factorization; sparse multifrontal direct factorization; sparse symmetric positive definite matrices; structured parallel application; workload distributing; workload scheduling; Computational modeling; Graphics processing unit; Instruction sets; Kernel; Libraries; Sparse matrices; Symmetric matrices;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel & Distributed Processing Symposium (IPDPS), 2011 IEEE International
Conference_Location :
Anchorage, AK
ISSN :
1530-2075
Print_ISBN :
978-1-61284-372-8
Electronic_ISBN :
1530-2075
Type :
conf
DOI :
10.1109/IPDPS.2011.44
Filename :
6012808