DocumentCode :
1796846
Title :
T-Storm: Traffic-Aware Online Scheduling in Storm
Author :
Jielong Xu ; Zhenhua Chen ; Jian Tang ; Sen Su
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Syracuse Univ., Syracuse, NY, USA
fYear :
2014
fDate :
June 30 2014-July 3 2014
Firstpage :
535
Lastpage :
544
Abstract :
Storm has emerged as a promising computation platform for stream data processing. In this paper, we first show, via experimental results and analysis, the inefficiencies of current Storm scheduling practice and the challenges of applying traffic-aware online scheduling in Storm. Motivated by our observations, we design and implement a new stream data processing system based on Storm, namely, T-Storm. Compared to Storm, T-Storm has the following desirable features: 1) based on runtime states, it accelerates data processing by leveraging effective traffic-aware scheduling for assigning/re-assigning tasks dynamically, which minimizes inter-node and inter-process traffic while ensuring that no worker nodes are overloaded; 2) it enables fine-grained control over worker node consolidation, such that T-Storm can achieve better performance with even fewer worker nodes; 3) it allows hot-swapping of scheduling algorithms and adjustment of scheduling parameters on the fly; and 4) it is transparent to Storm users (i.e., Storm applications can be ported to run on T-Storm without any changes). We conducted real experiments in a cluster using well-known data processing applications for performance evaluation. Extensive experimental results show that, compared to Storm (with the default scheduler), T-Storm can achieve over 84% and 27% speedup on lightly and heavily loaded topologies, respectively (in terms of average processing time), with 30% fewer worker nodes.
Keywords :
Big Data; Internet; performance evaluation; scheduling; telecommunication traffic; Storm scheduling; T-Storm; internode traffic minimization; interprocess traffic minimization; performance evaluation; scheduling algorithm hot-swapping; stream data processing; traffic-aware online scheduling; worker node consolidation; Data processing; Fasteners; Monitoring; Schedules; Scheduling; Storms; Topology; Big Data; Resource Management; Scheduling; Storm; Stream Data Processing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Distributed Computing Systems (ICDCS), 2014 IEEE 34th International Conference on
Conference_Location :
Madrid
ISSN :
1063-6927
Print_ISBN :
978-1-4799-5168-0
Type :
conf
DOI :
10.1109/ICDCS.2014.61
Filename :
6888929