Extracting flexible, replayable models from large block traces

  • Authors:
  • V. Tarasov;S. Kumar;J. Ma;D. Hildebrand;A. Povzner;G. Kuenning;E. Zadok

  • Affiliations:
  • Stony Brook University;Stony Brook University;Harvey Mudd College;IBM Almaden Research;IBM Almaden Research;Harvey Mudd College;Stony Brook University

  • Venue:
  • FAST'12 Proceedings of the 10th USENIX conference on File and Storage Technologies
  • Year:
  • 2012

Abstract

I/O traces are good sources of information about real-world workloads; replaying such traces is often used to reproduce the most realistic system behavior possible. But traces tend to be large, hard to use and share, and inflexible in representing more than the exact system conditions at the point the traces were captured. Often, however, researchers are not interested in the precise details stored in a bulky trace, but rather in some statistical properties found in the traces--properties that affect their system's behavior under load. We designed and built a system that (1) extracts many desired properties from a large block I/O trace, (2) builds a statistical model of the trace's salient characteristics, (3) converts the model into a concise description in the language of one or more synthetic load generators, and (4) can accurately replay the models in these load generators. Our system is modular and extensible. We experimented with several traces of varying types and sizes. Our concise models are 4-6% of the original trace size, and our modeling and replay accuracy are over 90%.
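
The general idea of steps (1), (2), and (4) can be illustrated with a minimal sketch. This is not the authors' system: the trace record layout, the names `TRACE`, `extract_model`, and `generate`, and the choice of properties (operation type and request size) are all assumptions made for illustration. The sketch also samples the two properties independently, which is a simplification.

```python
from collections import Counter
import random

# Hypothetical trace records: (timestamp_seconds, operation, offset_bytes, size_bytes).
# This layout is assumed for illustration; real block traces vary in format.
TRACE = [
    (0.00, "R", 0, 4096),
    (0.01, "R", 4096, 4096),
    (0.02, "W", 1048576, 8192),
    (0.03, "R", 8192, 4096),
]


def extract_model(trace):
    """Build empirical distributions of operation type and request size."""
    n = len(trace)
    ops = Counter(op for _, op, _, _ in trace)
    sizes = Counter(size for _, _, _, size in trace)
    return {
        "op": {k: v / n for k, v in ops.items()},
        "size": {k: v / n for k, v in sizes.items()},
    }


def generate(model, count, seed=0):
    """Draw synthetic (operation, size) requests from the model.

    Operation and size are sampled independently here, a simplification.
    """
    rng = random.Random(seed)
    op_keys, op_weights = zip(*model["op"].items())
    size_keys, size_weights = zip(*model["size"].items())
    ops = rng.choices(op_keys, weights=op_weights, k=count)
    sizes = rng.choices(size_keys, weights=size_weights, k=count)
    return list(zip(ops, sizes))


model = extract_model(TRACE)
synthetic = generate(model, 1000)
```

The model here is just two small dictionaries regardless of how many records the trace holds, which is the intuition behind a concise description replacing a bulky trace. A real load generator would additionally need properties such as offsets and inter-arrival times.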