• DocumentCode
    3754162
  • Title

    A fast parallel matrix inversion algorithm based on heterogeneous multicore architectures

  • Author

    Denggao Yu;Shiwen He;Yongming Huang;Guangshi Yu;Luxi Yang

  • Author_Institution
    School of Information Science and Engineering, Southeast University, Nanjing 210096, China
  • fYear
    2015
  • Firstpage
    903
  • Lastpage
    907
  • Abstract
    Large matrix inversion is usually a basic step in a wide range of signal processing and numerical problems, such as digital filtering and equalization detection, so an algorithm that inverts large matrices quickly and accurately is essential. Meanwhile, the graphics processing unit (GPU) provides a low-cost and flexible multicore architecture for high-performance computing, which has attracted many researchers' attention for building GPU-based software-defined radio (SDR). In this paper, we propose a fast parallel algorithm for matrix inversion on heterogeneous multicore architectures that exploits the computational power of the GPU. Our implementation is based on a modified Squared Givens Rotations (SGR) algorithm that adapts effectively to the GPU architecture. Implemented with Compute Unified Device Architecture (CUDA), it achieves a speedup of more than 20x over the CPU-only algorithm when the matrix becomes large, and runs at up to 12.14 GFLOPS on a GeForce GT 620 graphics processor.
  • Keywords
    "Graphics processing units","Signal processing algorithms","Multicore processing","Matrix decomposition","Mathematical model"
  • Publisher
    IEEE
  • Conference_Titel
    Signal and Information Processing (GlobalSIP), 2015 IEEE Global Conference on
  • Type

    conf

  • DOI
    10.1109/GlobalSIP.2015.7418328
  • Filename
    7418328