• DocumentCode
    228680
  • Title
    RAHTM: Routing Algorithm Aware Hierarchical Task Mapping
  • Author
    Abdel-Gawad, Ahmed H.; Thottethodi, Mithuna; Bhatele, Abhinav
  • Author_Institution
    Sch. of Electr. & Comput. Eng., Purdue Univ., West Lafayette, IN, USA
  • fYear
    2014
  • fDate
    16-21 Nov. 2014
  • Firstpage
    325
  • Lastpage
    335
  • Abstract
    The mapping of MPI processes to compute nodes on a supercomputer can have a significant impact on communication performance. For high performance computing (HPC) applications with iterative communication, rich offline analysis of such communication can improve performance by optimizing the mapping. Unfortunately, current practices for at-scale HPC consider only the communication graph and network topology in solving this problem. We propose Routing Algorithm aware Hierarchical Task Mapping (RAHTM), which leverages knowledge of the routing algorithm to improve task mapping. RAHTM achieves high-quality mappings by combining (1) a divide-and-conquer strategy to achieve scalability, (2) a limited search of mappings, and (3) a linear-programming-based, routing-aware approach to evaluate possible mappings in the search space. RAHTM achieves a 20% reduction in communication time and a 9% reduction in overall execution time for three communication-heavy benchmarks scaled up to 16,384 processes on a Blue Gene/Q platform.
  • Keywords
    divide and conquer methods; graph theory; parallel processing; Blue Gene/Q platform; HPC applications; MPI process mapping; RAHTM; communication graph; communication performance; communication-heavy benchmarks; divide-and-conquer strategy; high performance computing applications; iterative communication; linear programming; mapping optimization; network topology; offline analysis; performance improvement; routing algorithm aware hierarchical task mapping; supercomputer; Algorithm design and analysis; Bandwidth; Benchmark testing; Measurement; Network topology; Routing; Topology; divide-and-conquer; linear programming; routing; task mapping; torus;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing, Networking, Storage and Analysis, SC14: International Conference for
  • Conference_Location
    New Orleans, LA
  • Print_ISBN
    978-1-4799-5499-5
  • Type
    conf
  • DOI
    10.1109/SC.2014.32
  • Filename
    7013014