DocumentCode :
2793788
Title :
Performance Characterization of a Hierarchical MPI Implementation on Large-scale Distributed-memory Platforms
Author :
Alam, Sadaf R. ; Barrett, Richard ; Kuehn, Jeffery ; Poole, Steve
Author_Institution :
Sci. Comput. Group, CSCS-Swiss Nat. Supercomput. Center, Manno, Switzerland
fYear :
2009
fDate :
22-25 Sept. 2009
Firstpage :
132
Lastpage :
139
Abstract :
The building blocks of emerging petascale massively parallel processing (MPP) systems are multi-core processors, with four or more cores acting as a single processing element, and a customized network interface. The resulting memory and communication hierarchy of these platforms is now exposed to application developers and end users through a hierarchical, or multi-core aware, message-passing (MPI) programming interface and through a handful of tunable runtime parameters that allow mapping and control of MPI tasks and message handling. We characterize the performance of MPI communication patterns and present strategies for optimizing application performance on Cray XT series systems, which are composed of contemporary AMD processors and a proprietary network infrastructure. We highlight dependencies in their memory and network subsystems that could influence the performance of production-level applications. We demonstrate that MPI micro-benchmarks could mislead an application developer or end user, since these benchmarks often do not expose the interplay between memory allocation and usage in user space, which depends on the number of tasks or cores and on workload characteristics. Our studies show performance improvements over the default options for our target scientific benchmarks and production-level applications.
Keywords :
distributed memory systems; electronic messaging; message passing; optimisation; parallel processing; user interfaces; Cray XT series systems; MPI micro-benchmarks; Petascale massively parallel processing systems; application developer; contemporary AMD processors; hierarchical MPI implementation; large-scale distributed-memory platforms; memory subsystems; message handling; multi-core aware message-passing programming interface; multi-core processors; network subsystems; optimisation; proprietary network infrastructure; Application software; Communication system control; Large-scale systems; Multicore processing; Network interfaces; Parallel processing; Petascale computing; Portals; Runtime; Scientific computing; High performance computing; message passing communication; performance evaluation; runtime systems; scientific applications;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel Processing, 2009. ICPP '09. International Conference on
Conference_Location :
Vienna
ISSN :
0190-3918
Print_ISBN :
978-1-4244-4961-3
Electronic_ISBN :
0190-3918
Type :
conf
DOI :
10.1109/ICPP.2009.51
Filename :
5361957