Title :
PLAStiCC: Predictive Look-Ahead Scheduling for Continuous Dataflows on Clouds
Author :
Kumbhare, Alok Gautam ; Simmhan, Yogesh ; Prasanna, Viktor K.
Author_Institution :
Univ. of Southern California, Los Angeles, CA, USA
Abstract :
Scalable stream processing and continuous dataflow systems are gaining traction with the rise of big data due to the need for processing high velocity data in near real time. Unlike batch processing systems such as MapReduce and workflows, static scheduling strategies fall short for continuous dataflows due to the variations in the input data rates and the need for sustained throughput. The elastic resource provisioning of cloud infrastructure is valuable to meet the changing resource needs of such continuous applications. However, multi-tenant cloud resources introduce yet another dimension of performance variability that impacts the application's throughput. In this paper we propose PLAStiCC, an adaptive scheduling algorithm that balances resource cost and application throughput using a prediction-based look-ahead approach. It addresses variations not only in the input data rates but also in the underlying cloud infrastructure. In addition, we propose several simpler static scheduling heuristics that operate in the absence of an accurate performance prediction model. These static and adaptive heuristics are evaluated through extensive simulations using performance traces obtained from the Amazon AWS IaaS public cloud. Our results show an improvement of up to 20% in the overall profit as compared to the reactive adaptation algorithm.
Keywords :
cloud computing; data flow analysis; profitability; scheduling; software performance evaluation; Amazon AWS IaaS public cloud; MapReduce; PLAStiCC; adaptive scheduling algorithm; application throughput; batch processing systems; cloud infrastructure; continuous dataflow systems; elastic resource provisioning; high velocity data; multitenant cloud resources; performance variability; predictive look-ahead scheduling; profit; reactive adaptation algorithm; resource cost balances; scalable stream processing; static scheduling strategies; Cloud computing; Dynamic scheduling; Optimization; Predictive models; Quality of service; Runtime; Throughput; Continuous Dataflows; Elastic resource management; IaaS Clouds; Predictive scheduling; Stream processing;
Conference_Title :
Cluster, Cloud and Grid Computing (CCGrid), 2014 14th IEEE/ACM International Symposium on
Conference_Location :
Chicago, IL
DOI :
10.1109/CCGrid.2014.60