HydraFS: a high-throughput file system for the HYDRAstor content-addressable storage system

  • Authors:
  • Cristian Ungureanu;Benjamin Atkin;Akshat Aranya;Salil Gokhale;Stephen Rago;Grzegorz Całkowski;Cezary Dubnicki;Aniruddha Bohra

  • Affiliations:
  • NEC Laboratories America;NEC Laboratories America;NEC Laboratories America;NEC Laboratories America;NEC Laboratories America;NEC Laboratories America;NEC Laboratories America;NEC Laboratories America

  • Venue:
  • FAST'10 Proceedings of the 8th USENIX conference on File and storage technologies
  • Year:
  • 2010

Abstract

A content-addressable storage (CAS) system is a valuable tool for building storage solutions, providing efficiency by automatically detecting and eliminating duplicate blocks; it can also deliver high throughput, at least for streaming access. However, the absence of a standardized API is a barrier to the use of CAS by existing applications. Additionally, applications would have to deal with the unique characteristics of CAS, such as immutability of blocks and high latency of operations. An attractive alternative is to build a file system on top of CAS, since applications can use its interface without modification. Mapping a file system onto a CAS system efficiently, so as to obtain high duplicate elimination and high throughput, requires a design very different from that for a traditional disk subsystem. In this paper, we present the design, implementation, and evaluation of HydraFS, a file system built on top of HYDRAstor, a scalable, distributed, content-addressable block storage system. HydraFS provides high-performance reads and writes for streaming access, achieving 82-100% of the HYDRAstor throughput, while maintaining high duplicate elimination.
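
As an illustration of the CAS properties the abstract mentions (addresses derived from content, immutable blocks, automatic duplicate elimination), here is a minimal in-memory sketch. It is not the HYDRAstor or HydraFS interface: the class `ToyCAS`, the helpers `write_file` and `read_file`, the use of SHA-256, and the 4-byte block size are all assumptions chosen for illustration.

```python
import hashlib


class ToyCAS:
    """Hypothetical in-memory content-addressable block store (illustrative only)."""

    def __init__(self):
        # Maps an address (hex digest of the content) to the immutable block.
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is computed from the content itself (SHA-256 is an assumption).
        address = hashlib.sha256(data).hexdigest()
        # A duplicate block hashes to the same address, so it is stored only once.
        self._blocks.setdefault(address, bytes(data))
        return address

    def get(self, address: str) -> bytes:
        return self._blocks[address]

    def block_count(self) -> int:
        return len(self._blocks)


def write_file(store: ToyCAS, payload: bytes, block_size: int = 4) -> list:
    """Split payload into fixed-size blocks and return their addresses in order."""
    return [
        store.put(payload[i:i + block_size])
        for i in range(0, len(payload), block_size)
    ]


def read_file(store: ToyCAS, addresses: list) -> bytes:
    """Reassemble a file from its ordered list of block addresses."""
    return b"".join(store.get(a) for a in addresses)
```

In this sketch, writing `b"abcdabcdwxyz"` with the default block size yields three addresses but only two stored blocks, because the first two blocks are identical. Since a block's address depends on its content, a block cannot be modified in place; changing data means writing a new block and updating whatever refers to the old address, which is one reason a file system layered on CAS needs a different design from one layered on a disk.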