DocumentCode
75543
Title
Exploiting Efficient and Scalable Shuffle Transfers in Future Data Center Networks
Author
Deke Guo ; Junjie Xie ; Xiaolei Zhou ; Xiaomin Zhu ; Wei Wei ; Xueshan Luo
Author_Institution
Coll. of Inf. Syst. & Manage., Nat. Univ. of Defense Technol., Changsha, China
Volume
26
Issue
4
fYear
2015
fDate
April 1 2015
Firstpage
997
Lastpage
1009
Abstract
Distributed computing systems like MapReduce in data centers transfer massive amounts of data across successive processing stages. Such shuffle transfers account for most of the network traffic and make network bandwidth a bottleneck. In many commonly used workloads, the data flows in such a transfer are highly correlated and are aggregated at the receiver side. To reduce network traffic and use the available bandwidth efficiently, we propose to push the aggregation computation into the network and parallelize the shuffle and reduce phases. In this paper, we first examine the gain and feasibility of in-network aggregation in BCube, a novel server-centric networking structure for future data centers. To exploit this gain, we model the in-network aggregation problem, which is NP-hard in BCube. We propose two approximate methods for building an efficient IRS-based incast aggregation tree and an SRS-based shuffle aggregation subgraph, based solely on the labels of their members and the data center topology. We further design scalable forwarding schemes based on Bloom filters to implement in-network aggregation over massive concurrent shuffle transfers. Based on a prototype and large-scale simulations, we demonstrate that our approaches can significantly decrease the amount of network traffic and save data center resources. Our approaches for BCube can be adapted to other server-centric network structures for future data centers with minimal modifications.
Keywords
computational complexity; computer centres; computer networks; data structures; trees (mathematics); BCube; Bloom filters; IRS-based incast aggregation tree; MapReduce; NP-hard problem; SRS-based shuffle aggregation subgraph; data center networks; data center topology; data processing stage; distributed computing system; forwarding scheme; in-network aggregation problem; network bandwidth; network traffic; server-centric networking structure; shuffle transfer; Buildings; Hypercubes; Network topology; Receivers; Servers; Switches; Topology; Data center; data aggregation; shuffle transfer;
fLanguage
English
Journal_Title
Parallel and Distributed Systems, IEEE Transactions on
Publisher
IEEE
ISSN
1045-9219
Type
jour
DOI
10.1109/TPDS.2014.2316829
Filename
6787046