A fast parallel matrix inversion algorithm based on heterogeneous multicore architectures

Author

Denggao Yu; Shiwen He; Yongming Huang; Guangshi Yu; Luxi Yang

Author_Institution

School of Information Science and Engineering, Southeast University, Nanjing 210096, China

fYear

2015

Firstpage

903

Lastpage

907

Abstract

Large-matrix inversion is a basic step in a wide range of signal processing and numerical problems, such as digital filtering and equalization detection, so an algorithm that inverts large matrices quickly and accurately is essential. Meanwhile, the Graphics Processing Unit (GPU) provides a low-cost, flexible multicore architecture for high-performance computing, which has attracted many researchers' attention for building GPU-based software-defined radio (SDR). In this paper, we propose a fast parallel algorithm for matrix inversion on heterogeneous multicore architectures that exploits the computational power of the GPU. Our implementation is based on a modified Squared Givens Rotations (SGR) algorithm that adapts effectively to the GPU architecture. Implemented on the Compute Unified Device Architecture (CUDA), it achieves a speedup of more than 20x over the CPU-only algorithm when the matrix becomes large, and runs at up to 12.14 GFLOPS on a GeForce GT 620 graphics processor.

Keywords

"Graphics processing units","Signal processing algorithms","Multicore processing","Matrix decomposition","Mathematical model"

Publisher

IEEE

Conference_Titel

Signal and Information Processing (GlobalSIP), 2015 IEEE Global Conference on

Type

conf

DOI

10.1109/GlobalSIP.2015.7418328

Filename

7418328