Towards building performance models for data-intensive workloads in public clouds

  • Authors:
  • Rizwan Mian;Patrick Martin;Farhana Zulkernine;Jose Luis Vazquez-Poletti

  • Affiliations:
  • Queen's University, Kingston, ON, Canada;Queen's University, Kingston, ON, Canada;Queen's University, Kingston, ON, Canada;Universidad Complutense de Madrid, Madrid, Spain

  • Venue:
  • Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

The cloud computing paradigm provides the "illusion" of infinite resources and, therefore, becomes a promising candidate for large-scale data-intensive computing. In this paper, we explore experiment-driven performance models for data-intensive workloads executing in an infrastructure-as-a-service (IaaS) public cloud. The performance models help in predicting the workload behaviour, and serve as a key component of a larger framework for resource provisioning in the cloud. We determine a suitable prediction technique after comparing popular regression methods. We also enumerate the variables that impact variance in the workload performance in a public cloud. Finally, we build a performance model for a multi-tenant data service in the Amazon cloud. We find that a linear classifier is sufficient in most cases. On a few occasions, a linear classifier is unsuitable and non-linear modeling is required, which is time consuming. Consequently, we recommend that a linear classifier be used in training the performance model in the first instance. If the resulting model is unsatisfactory, then non-linear modeling can be carried out in the next step.