Expedited rating of data stores using agile data loading techniques

  • Authors:
  • Sumita Barahmand;Shahram Ghandeharizadeh

  • Affiliations:
  • University of Southern California, Los Angeles, CA, USA;University of Southern California, Los Angeles, CA, USA

  • Venue:
  • Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

To benchmark and rate a data store, one must repeat experiments that impose a different amount of load on the data store. Workloads that modify the benchmark database may require the same database to be loaded repeatedly. This may constitute a significant portion of the time to rate a data store. This paper presents several agile data loading techniques to expedite the rating process. These techniques include generating the disk image of the database once and re-using it, restoring the updated data items to their original value, maintaining in-memory state of the database across different experiments to avoid repeated loading of the database all together, and a hybrid of the third technique in combination with the other two. These techniques are general purpose and apply to a variety of cloud benchmarks. We investigate their implementation and evaluation in the context of one, the BG benchmark. Obtained results show a factor of two to twelve speedup in the rating process. As an example, when evaluating MongoDB with a million member BG database, we show these techniques expedite BG's rating from 4 months (123 days) of continuous running to less than 11 days for the first rating experiment. Subsequent ratings of MongoDB with different workloads using the same database is much faster, in the order of hours.