DocumentCode :
2547084
Title :
Building Multirail InfiniBand Clusters: MPI-Level Design and Performance Evaluation
Author :
Liu, Jiuxing ; Vishnu, Abhinav ; Panda, Dhabaleswar K.
Author_Institution :
Ohio State University
fYear :
2004
fDate :
06-12 Nov. 2004
Firstpage :
33
Lastpage :
33
Abstract :
In the area of cluster computing, InfiniBand is becoming increasingly popular due to its open standard and high performance. However, even with InfiniBand, network bandwidth can still become the performance bottleneck for some of today’s most demanding applications. In this paper, we study the problem of how to overcome the bandwidth bottleneck by using multirail networks. We present different ways of setting up multirail networks with InfiniBand and propose a unified MPI design that can support all these approaches. We also discuss various important design issues and provide in-depth discussions of different policies for using multirail networks, including an adaptive striping scheme that can dynamically change the striping parameters based on current system conditions. We have implemented our design and evaluated it using both microbenchmarks and applications. Our performance results show that multirail networks can significantly improve MPI communication performance. With a two-rail InfiniBand cluster, we have achieved almost twice the bandwidth and half the latency for large messages compared with the original MPI. At the application level, the multirail MPI can significantly reduce communication time as well as running time, depending on the communication pattern. We have also shown that the adaptive striping scheme can achieve excellent performance without a priori knowledge of the bandwidth of each rail.
Keywords :
Bandwidth; Buildings; Communication switching; Delay; Fabrics; Protocols; Rails; Read-write memory; Round robin; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Supercomputing, 2004. Proceedings of the ACM/IEEE SC2004 Conference
Print_ISBN :
0-7695-2153-3
Type :
conf
DOI :
10.1109/SC.2004.15
Filename :
1392963