Auto-tuning of Cloud-Based In-Memory Transactional Data Grids via Machine Learning

Authors:
Pierangelo Di Sanzo;Diego Rughetti;Bruno Ciciani;Francesco Quaglia
Venue:
NCCA '12 Proceedings of the 2012 Second Symposium on Network Cloud Computing and Applications
Year:
2012

Abstract

In-memory transactional data grids have proven extremely well suited to cloud-based environments, since they fit the elasticity requirements imposed by the pay-as-you-go cost model. In particular, the non-reliance on stable storage devices simplifies dynamic resizing of these platforms, which typically only involves setting up (or shutting down) some data-cache instances. On the other hand, determining the appropriate number of cache servers to deploy, and the degree of replication of slices of data, in order to optimize reliability/availability and performance tradeoffs, is far from being a trivial task. As an example, scaling the size of the underlying infrastructure up or down might give rise to scarcely predictable secondary effects on the synchronization protocol adopted to guarantee data consistency while supporting transactional accesses. In this paper we investigate the use of machine learning approaches to provide a means for automatically tuning the data grid configuration, which is achieved via dynamic selection of both the number of cache servers and the degree of replication of the data objects. The final target is to determine configurations that are able to guarantee specific throughput or latency values (such as those established by some SLA), under some specific workload profile/intensity, while at the same time minimizing the cost of the cloud infrastructure. Our proposal has been integrated within an operating environment relying on the well-known Infinispan data grid, a mainstream open source product by the Red Hat JBoss division. Some experimental data supporting the effectiveness of our proposal are also provided; they were obtained by deploying the data platform on top of Amazon EC2.
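
The selection problem stated in the abstract (meet a throughput target at minimum infrastructure cost) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Config`, `cheapest_config` and `toy_predictor` are hypothetical, the cost model (a flat price per node) is an assumption, and the predictor is a made-up analytic stand-in for whatever trained machine learning model would supply the performance estimates.

```python
from itertools import product
from typing import Callable, NamedTuple, Optional


class Config(NamedTuple):
    nodes: int        # number of cache servers
    replication: int  # copies kept of each data object


def cheapest_config(
    predict_throughput: Callable[[Config, float], float],
    workload: float,
    min_throughput: float,
    max_nodes: int,
    cost_per_node: float,
) -> Optional[Config]:
    """Return the cheapest configuration whose predicted throughput
    meets the target, or None if no candidate does.

    Assumption: cost grows linearly with the node count. Among equally
    cheap candidates, the one with more replicas is preferred.
    """
    best: Optional[Config] = None
    best_key = (float("inf"), 0)
    for nodes, replication in product(range(1, max_nodes + 1), repeat=2):
        if replication > nodes:
            continue  # cannot keep more copies than there are servers
        cfg = Config(nodes, replication)
        if predict_throughput(cfg, workload) < min_throughput:
            continue  # predicted to violate the throughput target
        key = (nodes * cost_per_node, -replication)
        if key < best_key:
            best, best_key = cfg, key
    return best


def toy_predictor(cfg: Config, workload: float) -> float:
    """Made-up stand-in for a learned model: each node serves 1000 tx/s,
    and every extra replica adds 15% synchronization overhead."""
    capacity = cfg.nodes * 1000.0 / (1.0 + 0.15 * (cfg.replication - 1))
    return min(workload, capacity)


if __name__ == "__main__":
    print(cheapest_config(toy_predictor, workload=3000.0,
                          min_throughput=2500.0, max_nodes=8,
                          cost_per_node=1.0))
```

An exhaustive search is used here only because the candidate space (node count times replication degree) is small. A latency target could be handled the same way by swapping in a latency predictor and reversing the comparison.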