Benchmarking OLTP/web databases in the cloud: the OLTP-bench framework

  • Authors:
  • Carlo A. Curino; Djellel E. Difallah; Andrew Pavlo; Philippe Cudré-Mauroux

  • Affiliations:
  • Microsoft, Mountain View, CA, USA; University of Fribourg, Fribourg, Switzerland; Brown University, Providence, RI, USA; University of Fribourg, Fribourg, Switzerland

  • Venue:
  • Proceedings of the fourth international workshop on Cloud data management
  • Year:
  • 2012

Abstract

Benchmarking is a key activity in building and tuning data management systems, but the lack of reference workloads and a common platform makes it a time-consuming and painful task. The need for such a tool is heightened with the advent of cloud computing, with its pay-per-use cost models, shared multi-tenant infrastructures, and lack of control over system configuration. Benchmarking is the only avenue for users to validate the quality of service they receive and to optimize their deployments for performance and resource utilization. In this talk, we present our experience in building several ad hoc benchmarking infrastructures for various research projects targeting several OLTP DBMSs, ranging from traditional relational databases to main-memory distributed systems and cloud-based scalable architectures. We also discuss our struggle to build meaningful micro-benchmarks and to gather workloads representative of real-world applications to stress-test our systems. This experience motivates the OLTP-Bench project, a batteries-included benchmarking infrastructure designed for and tested on several relational DBMSs and cloud-based database-as-a-service (DBaaS) offerings. OLTP-Bench can control the transaction rate, mixture, and workload skew dynamically during the execution of an experiment, allowing the user to simulate a multitude of practical scenarios that are otherwise hard to test (e.g., time-evolving access skew). Moreover, the infrastructure provides an easy way to monitor the performance and resource consumption of the database under test. We also introduce the ten included workloads, derived from synthetic micro-benchmarks, popular benchmarks, and real-world applications, and describe how they can be used to investigate various performance and resource-consumption characteristics of a data management system. We showcase the effectiveness of our benchmarking infrastructure and the usefulness of the workloads we selected by reporting sample results from hundreds of side-by-side comparisons on popular DBMSs and DBaaS offerings.
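
The dynamic control described in the abstract (transaction rate, mixture, and access skew varying over the course of a run) can be pictured as a phased workload schedule. The sketch below is purely illustrative and does not reflect OLTP-Bench's actual configuration format or API; the Phase class, run_schedule driver, and execute_txn callback are hypothetical names introduced here for exposition.

    # Illustrative sketch only (not the real OLTP-Bench API): a phased workload
    # schedule in which rate, transaction mixture, and access skew change
    # between phases of a single experiment.
    import random
    import time
    from dataclasses import dataclass

    @dataclass
    class Phase:
        duration_s: int      # how long this phase runs
        rate_tps: int        # target transactions per second
        weights: dict        # transaction mixture, e.g. {"NewOrder": 45, "Payment": 55}
        zipf_skew: float     # access-skew parameter (higher = more skewed)

    def run_schedule(phases, execute_txn):
        """Drive a client at the requested rate/mixture/skew for each phase."""
        for phase in phases:
            names = list(phase.weights)
            weights = list(phase.weights.values())
            deadline = time.time() + phase.duration_s
            while time.time() < deadline:
                txn = random.choices(names, weights=weights, k=1)[0]
                execute_txn(txn, phase.zipf_skew)
                time.sleep(1.0 / phase.rate_tps)  # crude open-loop rate control

    # Example: ramp the rate and shift the mixture mid-experiment.
    schedule = [
        Phase(60, 100, {"NewOrder": 45, "Payment": 55}, zipf_skew=0.5),
        Phase(60, 500, {"NewOrder": 80, "Payment": 20}, zipf_skew=1.2),
    ]

In OLTP-Bench itself, such scenarios are expressed declaratively in the benchmark configuration rather than in client code; the point of the sketch is only to show how a time-evolving rate, mixture, and skew can be specified as a sequence of phases.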