• DocumentCode
    39326
  • Title
    Higher Order Method of Moments With a Parallel Out-of-Core LU Solver on GPU/CPU Platform
  • Author
    Xing Mu ; Hou-Xing Zhou ; Kang Chen ; Wei Hong
  • Author_Institution
    State Key Lab. of Millimeter Waves, Southeast Univ., Nanjing, China
  • Volume
    62
  • Issue
    11
  • fYear
    2014
  • fDate
    Nov. 2014
  • Firstpage
    5634
  • Lastpage
    5646
  • Abstract
    This paper presents in detail a full realization of the higher order method of moments (HMoM) with a parallel out-of-core LU solver on a GPU/CPU platform. The realization has three parts. In the first part, a global auxiliary table and a local auxiliary table are introduced to eliminate many tedious and repetitive calculations, and a realization for GPU-oriented programming is then proposed and optimized. In the second part, an overlapped grouping of all the curved quadrilaterals is proposed. With this scheme, all the submatrices can be generated efficiently one by one, without wasting any calculations, with the help of both the video memory and the host memory. In the third part, a GPU-based out-of-core algorithm for LU decomposition is proposed and further developed into a hybrid GPU/CPU algorithm. Numerical examples test the robustness of the proposed algorithm by comparison with measurements and/or the traditional MoM with RWG basis functions, and demonstrate its overall performance by comparison with existing algorithms for similar problems. For generating the HMoM matrix, the proposed algorithm achieves a speedup ratio of about 7 to 12 over the GPU-based algorithm in the literature. For LU decomposition, its speedup ratio over the 8-threaded CPU-based algorithm can exceed 13 in single precision and 7 in double precision.
  • Keywords
    graphics processing units; matrix algebra; method of moments; parallel algorithms; 8-threaded CPU-based algorithm; GPU-based algorithm; GPU-based out-of-core algorithm; GPU-oriented programming; GPU/CPU platform; HMoM matrix; LU decomposition; RWG basis functions; curved quadrilaterals; global auxiliary table; higher order method of moments; host memory; hybrid GPU/CPU algorithm; local auxiliary table; overlapped grouping; parallel out-of-core LU solver; repetitive calculations; video memory; Graphics processing units; Hard disks; Instruction sets; Method of moments; Programming; Random access memory; System-on-chip; CUDA; GPU; OpenMP; high-order basis function; method of moments (MoM); out-of-core LU solver; parallel algorithm; speedup ratio
  • fLanguage
    English
  • Journal_Title
    Antennas and Propagation, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-926X
  • Type
    jour
  • DOI
    10.1109/TAP.2014.2350536
  • Filename
    6881670
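
The out-of-core LU decomposition named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the paper's algorithm: it assumes no pivoting (so it is only safe for matrices such as diagonally dominant ones), uses a plain dictionary `store` as a stand-in for slow storage (disk or host memory), and touches only a few tiles at a time. All function names here are hypothetical.

```python
"""Tile-by-tile (out-of-core style) LU factorization, pure-Python sketch."""

def lu_tile(a):
    """In-place unpivoted LU of a square tile (unit-lower L, upper U)."""
    n = len(a)
    for k in range(n):
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]

def solve_lower_unit(l, b):
    """In place: b <- L^{-1} b, where L is the unit-lower part of l."""
    for i in range(len(l)):
        for k in range(i):
            lik = l[i][k]
            for j in range(len(b[0])):
                b[i][j] -= lik * b[k][j]

def solve_upper_right(u, b):
    """In place: b <- b U^{-1}, where U is the upper part of u."""
    for j in range(len(u)):
        for row in b:
            for k in range(j):
                row[j] -= row[k] * u[k][j]
            row[j] /= u[j][j]

def matmul_sub(c, a, b):
    """In place: c <- c - a @ b for small dense tiles."""
    for i in range(len(a)):
        for k in range(len(b)):
            aik = a[i][k]
            for j in range(len(b[0])):
                c[i][j] -= aik * b[k][j]

def blocked_lu(store, nb):
    """Right-looking blocked LU over tiles held in `store` (assumed slow storage)."""
    for k in range(nb):
        diag = store[(k, k)]
        lu_tile(diag)                                # factor the diagonal tile
        for j in range(k + 1, nb):
            solve_lower_unit(diag, store[(k, j)])    # U_kj = L_kk^{-1} A_kj
        for i in range(k + 1, nb):
            solve_upper_right(diag, store[(i, k)])   # L_ik = A_ik U_kk^{-1}
        for i in range(k + 1, nb):                   # trailing update
            for j in range(k + 1, nb):
                matmul_sub(store[(i, j)], store[(i, k)], store[(k, j)])

def split(a, t):
    """Cut a dense matrix into t-by-t tiles; returns (store, tiles per side)."""
    nb = len(a) // t
    store = {(I, J): [row[J * t:(J + 1) * t] for row in a[I * t:(I + 1) * t]]
             for I in range(nb) for J in range(nb)}
    return store, nb

def join(store, nb, t):
    """Reassemble the tiles into one dense matrix."""
    n = nb * t
    return [[store[(i // t, j // t)][i % t][j % t] for j in range(n)]
            for i in range(n)]

# Usage: factor a small diagonally dominant matrix with 2-by-2 tiles.
A = [[10.0, 2.0, 1.0, 0.0],
     [3.0, 12.0, 2.0, 1.0],
     [1.0, 2.0, 9.0, 3.0],
     [0.0, 1.0, 2.0, 8.0]]
store, nb = split(A, 2)
blocked_lu(store, nb)
F = join(store, nb, 2)   # L below the diagonal (unit diagonal implied), U on and above
```

In a GPU setting, the trailing update is the step that dominates the work and is the natural candidate for offloading, while the tile store would live on disk or in host memory. How the paper actually partitions and schedules these steps is not stated in this record.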