Practical techniques for purging deleted data using liveness information

  • Authors:
  • David Boutcher;Abhishek Chandra

  • Affiliations:
  • University of Minnesota, Minneapolis, MN;University of Minnesota, Minneapolis, MN

  • Venue:
  • ACM SIGOPS Operating Systems Review - Research and developments in the Linux kernel
  • Year:
  • 2008
  • Virtual putty

    HotCloud'09 Proceedings of the 2009 conference on Hot topics in cloud computing

Quantified Score

Hi-index 0.00

Visualization

Abstract

The layered design of the Linux operating system hides the liveness of file system data from the underlying block layers. This lack of liveness information prevents the storage system from discarding blocks deleted by the file system, often resulting in poor utilization, security problems, inefficient caching, and migration overheads. In this paper, we define a generic "purge" operation that can be used by a file system to pass liveness information to the block layer with minimal changes in the layer interfaces, allowing the storage system to discard deleted data. We present three approaches for implementing such a purge operation: direct call, zero blocks, and flagged writes, each of which differs in their architectural complexity and potential performance overhead. We evaluate the feasibility of these techniques through a reference implementation of a dynamically resizable copy on write (COW) data store in User Mode Linux (UML). Performance results obtained from this reference implementation show that all these techniques can achieve significant storage savings with a reasonable execution time overhead. At the same time, our results indicate that while the direct call approach has the best performance, the zero block approach provides the best compromise in terms of performance overhead and its semantic and architectural simplicity. Overall, our results demonstrate that passing liveness information across the file system-block layer interface with minimal changes is not only feasible but practical.