Title :
Efficient MPI Collective Operations for Clusters in Long-and-Fast Networks
Author :
Matsuda, Motohiko ; Kudoh, Tomohiro ; Kodama, Yuetsu ; Takano, Ryousei ; Ishikawa, Yutaka
Author_Institution :
Grid Technol. Res. Center, Nat. Inst. of Adv. Ind. Sci. & Technol.
Abstract :
Several MPI systems for grid environments, in which clusters are connected by wide-area networks, have been proposed. However, the collective communication algorithms in such MPI systems assume relatively low-bandwidth wide-area networks and are not designed for the fast wide-area networks that are becoming available. For cluster MPI systems, on the other hand, a bcast algorithm by van de Geijn et al. and an allreduce algorithm by Rabenseifner have been proposed, both of which are efficient in a high bisection bandwidth environment. We modify those algorithms to utilize fast wide-area inter-cluster networks effectively and to control the number of nodes that can transfer data simultaneously through the wide-area network, in order to avoid congestion. We confirmed the effectiveness of the modified algorithms by experiments in a 10 Gbps emulated WAN environment. The environment consists of two clusters, each made up of nodes with 1 Gbps Ethernet links and a switch with a 10 Gbps uplink. The two clusters are connected through a 10 Gbps WAN emulator that can insert latency. In a 10 millisecond latency environment with a message size of 32 MB, the proposed bcast and allreduce are 1.6 and 3.2 times faster, respectively, than the algorithms used in existing MPI systems for grid environments.
Keywords :
grid computing; message passing; wide area networks; workstation clusters; 1 Gbit/s; 10 Gbit/s; Ethernet; allreduce algorithm; bandwidth environment; bcast algorithm; cluster MPI system; grid environment; long-and-fast networks; wide-area intercluster networks; Algorithm design and analysis; Bandwidth; Clustering algorithms; Delay; Educational technology; Ethernet networks; Network interfaces; Optical scattering; Switches; Wide area networks
Conference_Title :
Cluster Computing, 2006 IEEE International Conference on
Conference_Location :
Barcelona
Print_ISBN :
1-4244-0327-8
Electronic_ISBN :
1552-5244
DOI :
10.1109/CLUSTR.2006.311848