Title :
Auto-tuning of Cloud-Based In-Memory Transactional Data Grids via Machine Learning
Author :
Di Sanzo, Pierangelo ; Rughetti, Diego ; Ciciani, Bruno ; Quaglia, Francesco
Author_Institution :
DIAG, Sapienza Univ., Rome, Italy
Abstract :
In-memory transactional data grids have proven extremely well suited for cloud-based environments, given that they fit well the elasticity requirements imposed by the pay-as-you-go cost model. In particular, the non-reliance on stable storage devices simplifies dynamic resizing of these platforms, which typically only involves setting up (or shutting down) some data-cache instance. On the other hand, determining the well-suited number of cache servers to deploy, and the degree of replication of slices of data, in order to optimize reliability/availability and performance tradeoffs, is far from a trivial task. As an example, scaling the size of the underlying infrastructure up or down might give rise to scarcely predictable secondary effects on the side of the synchronization protocol adopted to guarantee data consistency while supporting transactional accesses. In this paper we investigate the usage of machine learning approaches with the aim of providing a means for automatically tuning the data grid configuration, achieved via dynamic selection of both the well-suited number of cache servers and the well-suited degree of replication of the data objects. The final target is to determine configurations that are able to guarantee specific throughput or latency values (such as those established by some SLA), under a specific workload profile/intensity, while at the same time minimizing the cost of the cloud infrastructure. Our proposal has been integrated within an operating environment relying on the well-known Infinispan data grid, namely a mainstream open source product by the Red Hat JBoss division. Experimental data supporting the effectiveness of our proposal are also provided, obtained by deploying the data platform on top of Amazon EC2.
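The abstract describes learning a performance model and then selecting the cheapest configuration (number of cache servers and replication degree) that meets an SLA. The following is only a minimal illustrative sketch of that idea, not the authors' implementation: it assumes a neural-network regressor (here scikit-learn's MLPRegressor), illustrative feature names, toy training samples, and a hypothetical per-server cost model.

```python
# Minimal sketch (illustrative, not the paper's implementation): learn a throughput
# model from observed (workload intensity, #cache servers, replication degree)
# samples, then pick the cheapest configuration predicted to satisfy an SLA.
# All numbers, feature names, and the cost model below are assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

# Training samples: [tx_arrival_rate, num_cache_servers, replication_degree] -> throughput
X_train = np.array([
    [1000, 2, 1], [1000, 4, 2], [2000, 4, 2],
    [2000, 6, 2], [3000, 6, 3], [3000, 8, 3],
])
y_train = np.array([950.0, 990.0, 1700.0, 1900.0, 2500.0, 2800.0])

model = MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=5000, random_state=0)
model.fit(X_train, y_train)

def cheapest_config(workload, sla_throughput, max_servers=10, cost_per_server=0.12):
    """Enumerate candidate configurations and return the cheapest one whose
    predicted throughput meets the SLA, or None if no candidate qualifies."""
    best = None
    for servers in range(2, max_servers + 1):
        for repl in range(1, servers + 1):
            predicted = model.predict([[workload, servers, repl]])[0]
            if predicted >= sla_throughput:
                cost = servers * cost_per_server
                if best is None or cost < best[2]:
                    best = (servers, repl, cost, predicted)
    return best

print(cheapest_config(workload=2500, sla_throughput=2000))
```

In the same spirit as the abstract, the learned model stands in for direct measurement of scarcely predictable effects (e.g., of the data-consistency protocol) when the platform is resized, so candidate configurations can be evaluated without deploying them.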
Keywords :
cache storage; cloud computing; grid computing; learning (artificial intelligence); public domain software; Amazon EC2; Infinispan data grid; Red Hat JBoss division; cache servers; cloud-based in-memory transactional data grids auto-tuning; data consistency; elasticity requirements; machine learning; mainstream open source product; pay-as-you-go cost model; predictable secondary effects; stable storage devices; transactional accesses; workload profile-intensity; Benchmark testing; Machine learning; Neural networks; Proposals; Servers; Throughput; Time factors; cloud computing; in-memory data platforms; transactional data platforms;
Conference_Title :
Network Cloud Computing and Applications (NCCA), 2012 Second Symposium on
Conference_Location :
London
Print_ISBN :
978-1-4673-5581-0
DOI :
10.1109/NCCA.2012.20