Title :
Taming Latency in Data Center Networking with Erasure Coded Files
Author :
Xiang, Yu ; Aggarwal, Vaneet ; Chen, Yih-Farn R. ; Lan, Tian
Author_Institution :
Dept. of ECE, George Washington Univ., Washington, DC, USA
Abstract :
This paper proposes an approach to minimize service latency in a data center network where erasure-coded files are stored on distributed disks/racks and access requests are scattered across the network. Because bandwidth is limited at both top-of-the-rack and aggregation switches, network bandwidth must be apportioned among intra- and inter-rack data flows in line with their traffic statistics. We formulate this problem as weighted queuing and employ a class of probabilistic request scheduling policies to derive a closed-form upper bound on service latency for erasure-coded storage with arbitrary file access patterns and service time distributions. The result enables a joint latency optimization over three entangled "control knobs": the bandwidth allocation at top-of-the-rack and aggregation switches, the probabilities for scheduling file requests, and the placement of encoded file chunks, which affects data locality. The joint optimization is shown to be a mixed-integer problem. We develop an iterative algorithm that decouples the joint optimization into three sub-problems, each of which is either convex or solvable via bipartite matching in polynomial time. The proposed algorithm is prototyped in Tahoe, an open-source distributed file system, and evaluated on a cloud testbed with 16 separate physical hosts in an OpenStack cluster. Experiments validate our theoretical latency analysis and show significant latency reduction for diverse file access patterns. The results provide valuable insight into designing low-latency data center networks with erasure-coded storage.
Keywords :
computer centres; distributed databases; iterative methods; open systems; pattern matching; queueing theory; OpenStack cluster; Tahoe; aggregation switches; bandwidth allocation; bipartite matching; control knobs; data center networking; distributed disks; distributed file system; encoded file chunks; erasure coded files; erasure-coded storage; file access patterns; interrack data flows; intrarack data flows; iterative algorithm; joint latency optimization; low-latency data center networks; mixed-integer problem; network bandwidth; open-source system; probabilistic request scheduling policies; service latency; service latency minimization; service time distributions; traffic statistics; weighted queuing; Aggregates; Bandwidth; Delays; Joints; Optimization; Servers; Upper bound; Erasure-coded; data center; service latency; storage;
Conference_Title :
Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on
Conference_Location :
Shenzhen
DOI :
10.1109/CCGrid.2015.142
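
Illustration :
The abstract mentions probabilistic request scheduling without giving details. As a rough sketch only, the snippet below assumes a file whose n chunks sit on n nodes, any k of which suffice to serve a request, and assumes each request should hit node j with a given probability pi[j] (with the pi[j] summing to k). It uses systematic sampling, a standard technique chosen here for illustration and not necessarily the construction used in the paper. The function name `pick_nodes` and the example probabilities are hypothetical.

```python
import random


def pick_nodes(pi, k, rng=random):
    """Return k distinct node indices; node j is included with probability pi[j].

    Assumes 0 <= pi[j] <= 1 for every j and sum(pi) == k (hypothetical inputs).
    Uses systematic sampling: lay the pi[j] end to end on [0, k) and select the
    nodes whose segments contain the points u, u+1, ..., u+k-1 for a uniform u.
    """
    if any(p < 0.0 or p > 1.0 for p in pi):
        raise ValueError("each probability must lie in [0, 1]")
    if abs(sum(pi) - k) > 1e-9:
        raise ValueError("probabilities must sum to k")

    target = rng.random()  # first threshold u in [0, 1)
    chosen = []
    cum = 0.0
    last = len(pi) - 1
    for j, p in enumerate(pi):
        # Pin the final cumulative sum to k to guard against rounding drift.
        cum = float(k) if j == last else cum + p
        if target < cum and len(chosen) < k:
            chosen.append(j)
            target += 1.0  # move to the next threshold
    return chosen


if __name__ == "__main__":
    # Hypothetical (n=4, k=2) file: node 3 is favored, node 0 is rarely used.
    probabilities = [0.2, 0.4, 0.5, 0.9]
    print(pick_nodes(probabilities, k=2))
```

Because every segment has length at most 1, no node can be hit by two thresholds, so the k selected nodes are always distinct. In an optimization like the one described in the abstract, the probabilities themselves would be decision variables; here they are fixed inputs.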