C-Miner: Mining Block Correlations in Storage Systems

  • Authors:
  • Zhenmin Li;Zhifeng Chen;Sudarshan M. Srinivasan;Yuanyuan Zhou

  • Affiliations:
  • University of Illinois at Urbana-Champaign;University of Illinois at Urbana-Champaign;University of Illinois at Urbana-Champaign;University of Illinois at Urbana-Champaign

  • Venue:
  • FAST '04 Proceedings of the 3rd USENIX Conference on File and Storage Technologies
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

Block correlations are common semantic patterns in storage systems. These correlations can be exploited for improving the effectiveness of storage caching, prefetching, data layout and disk scheduling. Unfortunately, information about block correlations is not available at the storage system level. Previous approaches for discovering file correlations in file systems do not scale well enough to be used for discovering block correlations in storage systems. In this paper, we propose C-Miner, an algorithm which uses a data mining technique called frequent sequence mining to discover block correlations in storage systems. C-Miner runs reasonably fast with feasible space requirement, indicating that it is a practical tool for dynamically inferring correlations in a storage system. Moreover, we have also evaluated the benefits of block correlation-directed prefetching and data layout through experiments. Our results using real system workloads show that correlation-directed prefetching and data layout can reduce average I/O response time by 12-25% compared to the base case, and 7-20% compared to the commonly used sequential prefetching scheme.