DocumentCode :
2789523
Title :
Bandwidth Efficient All-reduce Operation on Tree Topologies
Author :
Patarasuk, Pitch ; Yuan, Xin
Author_Institution :
Dept. of Comput. Sci., Florida State Univ., Tallahassee, FL
fYear :
2007
fDate :
26-30 March 2007
Firstpage :
1
Lastpage :
8
Abstract :
We consider efficient implementations of the all-reduce operation with large data sizes on tree topologies. We prove a tight lower bound on the amount of data that must be transmitted to carry out the all-reduce operation and use it to derive a lower bound on the communication time of this operation. We develop a topology-specific algorithm that is bandwidth efficient in that (1) the amount of data sent and received by each process is the minimum for this operation; and (2) the communications do not incur network contention on the tree topology. With the proposed algorithm, the all-reduce operation can be realized on the tree topology as efficiently as on any other topology when the data size is sufficiently large. The proposed algorithm can be applied to several contemporary cluster environments, including high-end clusters of workstations with SMP and/or multi-core nodes and low-end Ethernet-switched clusters. We evaluate the algorithm on various clusters of workstations, including a Myrinet cluster with dual-processor SMP nodes, an InfiniBand cluster whose SMP nodes each have two dual-core processors, and an Ethernet-switched cluster with single-processor nodes. The results show that the routines implemented based on the proposed algorithm significantly outperform the native MPI_Allreduce and other recently developed algorithms for high-end SMP clusters when the data size is sufficiently large.
Keywords :
application program interfaces; bandwidth allocation; message passing; tree data structures; workstation clusters; Ethernet switched cluster; Myrinet cluster; SMP cluster; application program interface; bandwidth allocation; dual-processor SMP node; message passing; tree topology; Bandwidth; Clustering algorithms; Communication switching; Computer science; Ethernet networks; Message passing; Network topology; Switches; Tree graphs; Workstations;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE International
Conference_Location :
Long Beach, CA
Print_ISBN :
1-4244-0910-1
Electronic_ISBN :
1-4244-0910-1
Type :
conf
DOI :
10.1109/IPDPS.2007.370405
Filename :
4228133