Abstract :
Autoscaling has become an integral feature of cloud computing services, allowing users to dynamically scale cloud resources on demand for both performance and cost. Moreover, a recent survey shows the importance of satisfying long-term budget constraints (e.g., monthly or yearly) for cloud users. However, meeting such constraints while optimizing delay performance is challenging: it requires complete offline information, such as the workload demand over the entire budgeting period, which is difficult to predict accurately. This paper proposes a new autoscaling system, BATS, which optimizes delay performance while meeting long-term budget constraints using only past and instantaneous workload information. Analytically, we prove that, for arbitrary workload arrivals, the autoscaling algorithm of BATS achieves close-to-optimal performance even compared to the optimal solution that has complete offline information. Empirically, we build the BATS autoscaler as a user-friendly service for running applications on Windows Azure. The experimental results show that BATS achieves both lower cost and lower delay than state-of-the-art threshold-based autoscaling solutions. We also run simulation studies to complement the implementation results, demonstrating the effectiveness, scalability, and robustness of BATS in reducing both average and tail latency under various workload scenarios.
Keywords :
"Delays","Cloud computing","Prediction algorithms","Algorithm design and analysis","Runtime","Computational modeling"
Conference_Title :
Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), 2015 IEEE 23rd International Symposium on