• DocumentCode
    3205240
  • Title

    Redesign of Higher-Level Matrix Algorithms for Multicore and Distributed Architectures and Applications in Quantum Monte Carlo Simulation

  • Author

    Lee, Che-Rung ; Bai, Zhaojun

  • Author_Institution
    Dept. of Comput. Sci., Nat. TsingHua Univ., Hsinchu, Taiwan
  • fYear
    2011
  • fDate
    16-20 May 2011
  • Firstpage
    266
  • Lastpage
    274
  • Abstract
    A matrix operation is referred to as a hard-to-parallel matrix operation (HPMO) if it has serial bottlenecks that are hardly parallelizable. Otherwise, it is referred to as an easy-to-parallel matrix operation (EPMO). Empirical evidences showed the performance scalability of an HPMO is significantly poorer than an EPMO on multicore and distributed architectures. As the result, the design of higher-level algorithms for applications, for the performance considerations on multicore and distributed architectures, should avoid the use of HPMOs as the computational kernels. In this paper, as a case study, we present an HPMO-avoiding algorithm for the Green´s function calculation in quantum Monte Carlo simulation. The original algorithm utilizes the QR-decomposition with column pivoting (QRP) as its computational kernel. QRP is an HPMO. The redesigned algorithm maintains the same simulation stability but employs the standard QR decomposition without pivoting (QR), which is an EPMO. Different implementations of the redesigned algorithm on multicore and distributed architectures are investigated. Although some implementations of the redesigned method use about a factor of three more floating-point operations than the original algorithm, they are about 20% faster on a quad core system and 2.5 times faster on a 1024-CPU massively parallel processing system. The broader impact of the redesign of higher-level matrix algorithms to avoid HPMOs in other computational science applications is also discussed.
  • Keywords
    Monte Carlo methods; matrix algebra; multiprocessing systems; parallel architectures; QR-decomposition; column pivoting; distributed architectures; easy-to-parallel matrix operation; hard-to-parallel matrix operation; higher-level matrix algorithms; multicore architectures; quantum Monte Carlo simulation; Algorithms; Green´s function methods; Kernel; Matrix decomposition; Monte Carlo methods; Multicore processing;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Parallel & Distributed Processing Symposium (IPDPS), 2011 IEEE International
  • Conference_Location
    Anchorage, AK
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-61284-372-8
  • Electronic_ISBN
    1530-2075
  • Type

    conf

  • DOI
    10.1109/IPDPS.2011.34
  • Filename
    6012843