Efficiently identifying working sets in block I/O streams

  • Authors: Avani Wildani; Ethan L. Miller; Lee Ward

  • Affiliations: Univ. of California, Santa Cruz, Santa Cruz, California; Univ. of California, Santa Cruz, Santa Cruz, California; Computer Science Research Institute, Sandia National Labs, Albuquerque, NM

  • Venue: Proceedings of the 4th Annual International Conference on Systems and Storage
  • Year: 2011

Abstract

Identifying groups of blocks that tend to be read or written together in a given environment is the first step towards powerful techniques for device failure isolation and power management. For example, identified groups can be placed together on a single disk, avoiding excess drive activity across an exascale storage system. Unlike previous grouping work, we focus on identifying groupings in data that can be gathered from real, running systems with minimal impact. Using temporal, spatial, and access ordering information from an enterprise data set, we identified a set of groupings that appear consistently, indicating that they are working sets likely to be accessed together. We present several techniques for obtaining groupings, along with a discussion of which techniques best apply to particular types of real systems. We intend to use these preliminary results to inform our search for new types of workloads. Our goals are to identify the properties of easily separable workloads across different systems and to dynamically move groups in these workloads to reduce disk activity in large storage systems.
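
To make the idea of a recurring grouping concrete, the following is a minimal sketch and not the authors' method: it uses only temporal information, splitting a block trace into bursts wherever the gap between consecutive accesses exceeds a threshold, and reporting sets of blocks that recur as a whole burst. The function name `candidate_groups`, the parameters `max_gap` and `min_support`, and the `(timestamp, block_id)` trace format are all assumptions made for illustration.

```python
from collections import Counter


def candidate_groups(trace, max_gap=1.0, min_support=2):
    """Return block sets that recur as whole bursts in a trace.

    trace: iterable of (timestamp_seconds, block_id), sorted by timestamp.
    max_gap: hypothetical threshold; a longer idle gap starts a new burst.
    min_support: hypothetical minimum number of bursts a set must appear in.
    """
    bursts = []
    current = []
    last_ts = None
    for ts, block in trace:
        # Close the current burst when the idle gap is too long.
        if last_ts is not None and ts - last_ts > max_gap:
            bursts.append(frozenset(current))
            current = []
        current.append(block)
        last_ts = ts
    if current:
        bursts.append(frozenset(current))

    # Count identical multi-block bursts, ignoring access order within a burst.
    counts = Counter(b for b in bursts if len(b) > 1)
    return {group: n for group, n in counts.items() if n >= min_support}


# Made-up trace: blocks 10, 11, 12 are accessed together twice.
trace = [
    (0.0, 10), (0.1, 11), (0.2, 12),
    (5.0, 40),
    (9.0, 11), (9.1, 10), (9.2, 12),
    (20.0, 41),
]
groups = candidate_groups(trace)
```

Exact set matching is deliberately strict and would miss groups that overlap only partially; a practical grouping technique would also need spatial and ordering information, as the abstract notes.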