Title :
Performance optimization for distributed intra-node-parallel streaming systems
Author :
Sax, M.J. ; Castellanos, M. ; Qiming Chen ; Meichun Hsu
Author_Institution :
Databases & Inf. Syst. Group, Humboldt-Univ. zu Berlin, Berlin, Germany
Abstract :
The performance of intra-node parallel dataflow programs in the context of streaming systems depends mainly on two parameters: the degree of parallelism and the batching size for each node of the dataflow program. In state-of-the-art systems the user has to specify those values manually. Manual tuning of both parameters is necessary in order to get good performance. However, this process is difficult and time-consuming, even for experts. In this paper we introduce an optimization algorithm that optimizes both parameters automatically. We define a novel cost model for intra-node parallel dataflow programs with user-defined functions. Furthermore, we introduce different batching schemes to reduce the number of output buffers, i.e., main memory consumption. We implemented our approach on top of the open source system Storm and ran experiments with different workloads. Our results show a throughput improvement of more than one order of magnitude while the optimization time is less than a second.
Keywords :
distributed processing; optimisation; parallel processing; public domain software; batching size; distributed intranode parallel streaming systems; intranode parallel dataflow programs; main memory consumption; open source system storm; optimization algorithm; optimization time; output buffers; performance optimization; state-of-the-art systems; user defined functions; Layout; Optimization; Parallel processing; Programming; Shape; Storms; Throughput;
Conference_Title :
Data Engineering Workshops (ICDEW), 2013 IEEE 29th International Conference on
Conference_Location :
Brisbane, QLD
Print_ISBN :
978-1-4673-5303-8
Electronic_ISBN :
978-1-4673-5302-1
DOI :
10.1109/ICDEW.2013.6547428