DocumentCode :
253362
Title :
MTR: Fault tolerant routing in Clos data center network with miswiring links
Author :
Changlin Jiang ; Wei Liang ; Mingwei Xu ; Lili Liu
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
fYear :
2014
fDate :
21-23 May 2014
Firstpage :
1
Lastpage :
6
Abstract :
The data center network (DCN) is a key component of cloud computing. With the rapid expansion of cloud computing, DCNs keep growing in scale. However, lacking a proper engineering management method, engineers may miswire some links while building a DCN, which is called the “miswiring problem”. These miswiring links cause the physical topology to differ from the design blueprint graph of the DCN, resulting in communication errors. Previous works (DAC [1] and ETAC [2]) only detect devices with miswiring links. DAC cannot let the network operate until engineers fix the miswiring links manually, which is a time-consuming and error-prone task. ETAC uses only the devices without miswiring links and excludes devices with miswiring links from operation, which wastes link resources and reduces network throughput. In this paper, we focus on the miswiring problem in Clos-based DCNs and introduce an effective algorithm to detect and correct miswiring links. Moreover, we propose a miswiring tolerant routing protocol (MTR) that embraces miswiring links, increasing network throughput in their presence. Simulation results show that for a Fat-Tree network with 128,000 servers, our design can efficiently detect and correct miswiring links (at most 20% of links miswired) in less than 120 milliseconds. In a 32-ary Fat-Tree network, compared with ECMP, MTR reduces the data transmission completion time by 2.5%, 5.43%, 8.74%, and 11.66% when the percentage of miswiring links is 5%, 10%, 15%, and 20%, respectively.
Keywords :
cloud computing; computer centres; computer networks; fault tolerance; routing protocols; telecommunication network topology; Clos data center network; DCN communication error; MTR protocol; blueprint graph design; cloud computing; engineering management method; error-prone task; fat-tree network; fault tolerant routing; miswiring links; physical topology; IP networks; Indexes; Network topology; Ports (Computers); Servers; Switches; Topology;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Local & Metropolitan Area Networks (LANMAN), 2014 IEEE 20th International Workshop on
Conference_Location :
Reno, NV
Type :
conf
DOI :
10.1109/LANMAN.2014.7028643
Filename :
7028643