DocumentCode :
3501416
Title :
Designing High Performance and Scalable MPI Intra-node Communication Support for Clusters
Author :
Chai, Lei ; Hartono, Albert ; Panda, Dhabaleswar K.
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH
fYear :
2006
fDate :
25-28 Sept. 2006
Firstpage :
1
Lastpage :
10
Abstract :
As new processor and memory architectures advance, clusters are starting to be built from larger SMP systems, which makes MPI intra-node communication a critical issue in high performance computing. This paper presents a new design for MPI intra-node communication that aims to achieve both high performance and good scalability in a cluster environment. The design distinguishes small and large messages and handles them differently to minimize the data transfer overhead for small messages and the memory space consumed by large messages. Moreover, the design utilizes the cache efficiently and requires no locking mechanisms to achieve optimal performance even with large system size. This paper also explores various optimization strategies to reduce polling overhead and maintain data locality. We have evaluated our design on NUMA (non-uniform memory access) and dual-core NUMA systems. The experimental results on the NUMA system show that the new design can improve MPI intra-node latency by up to 35% and bandwidth by up to 50% compared to MVAPICH. While running the bandwidth benchmark, the measured L2 cache miss rate is reduced by half. The new design also improves the performance of MPI collective calls by up to 25%. The results on the dual-core NUMA system show that the new design can achieve 0.48 μs in CMP latency.
Keywords :
message passing; shared memory systems; workstation clusters; L2 cache miss rate; bandwidth benchmark; cluster computing; dual core NUMA system; high performance MPI intra-node communication support; larger SMP systems; memory architectures; multicore processor; nonuniform memory access systems; scalable MPI intra-node communication support; Bandwidth; Computer architecture; Computer science; Delay; Design engineering; High performance computing; Memory architecture; Multicore processing; Scalability; Sun; Cluster Computing; Intra-node Communication; MPI; Multi-core Processor; Non-Uniform Memory Access (NUMA)
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cluster Computing, 2006 IEEE International Conference on
Conference_Location :
Barcelona
ISSN :
1552-5244
Print_ISBN :
1-4244-0327-8
Electronic_ISBN :
1552-5244
Type :
conf
DOI :
10.1109/CLUSTR.2006.311850
Filename :
4100356