DocumentCode
228680
Title
RAHTM: Routing Algorithm Aware Hierarchical Task Mapping
Author
Abdel-Gawad, Ahmed H. ; Thottethodi, Mithuna ; Bhatele, Abhinav
Author_Institution
Sch. of Electr. & Comput. Eng., Purdue Univ., West Lafayette, IN, USA
fYear
2014
fDate
16-21 Nov. 2014
Firstpage
325
Lastpage
335
Abstract
The mapping of MPI processes to compute nodes on a supercomputer can have a significant impact on communication performance. For high performance computing (HPC) applications with iterative communication, rich offline analysis of such communication can improve performance by optimizing the mapping. Unfortunately, current practices for at-scale HPC consider only the communication graph and network topology in solving this problem. We propose Routing Algorithm aware Hierarchical Task Mapping (RAHTM), which leverages knowledge of the routing algorithm to improve task mapping. RAHTM achieves high-quality mappings by combining (1) a divide-and-conquer strategy to achieve scalability, (2) a limited search of mappings, and (3) a linear-programming-based, routing-aware approach to evaluate possible mappings in the search space. RAHTM achieves a 20% reduction in communication time and a 9% reduction in overall execution time for three communication-heavy benchmarks scaled up to 16,384 processes on a Blue Gene/Q platform.
Keywords
divide and conquer methods; graph theory; parallel processing; Blue Gene/Q platform; HPC applications; MPI process mapping; RAHTM; communication graph; communication performance; communication-heavy benchmarks; divide-and-conquer strategy; high performance computing applications; iterative communication; linear programming; mapping optimization; network topology; offline analysis; performance improvement; routing algorithm aware hierarchical task mapping; supercomputer; Algorithm design and analysis; Bandwidth; Benchmark testing; Measurement; Network topology; Routing; Topology; divide-and-conquer; linear programming; routing; task mapping; torus;
fLanguage
English
Publisher
IEEE
Conference_Titel
High Performance Computing, Networking, Storage and Analysis, SC14: International Conference for
Conference_Location
New Orleans, LA
Print_ISBN
978-1-4799-5499-5
Type
conf
DOI
10.1109/SC.2014.32
Filename
7013014