Why panic()?: improving reliability with restartable file systems

  • Authors:
  • Swaminathan Sundararaman; Sriram Subramanian; Abhishek Rajimwale; Andrea C. Arpaci-Dusseau; Remzi H. Arpaci-Dusseau; Michael M. Swift

  • Affiliations:
  • University of Wisconsin, Madison (all six authors)

  • Venue:
  • ACM SIGOPS Operating Systems Review
  • Year:
  • 2010

Abstract

The file system is one of the most critical components of the operating system. Almost all applications running in the operating system require file systems to be available for their proper operation. Though file-system availability is critical in many cases, very little work has been done on tolerating file-system crashes. In this paper, we propose Membrane, a set of changes to the operating system to support restartable file systems. Membrane allows an operating system to tolerate a broad class of file-system failures and does so while remaining transparent to running applications; upon failure, the file system restarts, its state is restored, and pending application requests are serviced as if no failure had occurred. Our initial evaluation of Membrane with ext2 shows that Membrane induces little performance overhead and can tolerate a wide range of file-system crashes. More critically, Membrane does so with few changes to ext2, thus improving robustness to crashes without mandating intrusive changes to existing file-system code.
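
The restart sequence described in the abstract (restart the file system, restore its state, then service the pending request) can be illustrated with a toy user-space model. This is only a sketch under stated assumptions, not Membrane's implementation: the abstract does not say how state is restored, so the checkpoint-plus-request-log approach here is an assumption, and every name (`ToyFs`, `RestartableFs`, `FsCrash`, `fail_next`) is hypothetical.

```python
import copy


class FsCrash(Exception):
    """Simulated file-system fault (a stand-in for a kernel panic)."""


class ToyFs:
    """Hypothetical in-memory file system with one-shot fault injection."""

    def __init__(self):
        self.files = {}
        self.fail_next = False  # when True, the next write crashes

    def write(self, name, data):
        if self.fail_next:
            self.fail_next = False
            raise FsCrash("injected fault")
        self.files[name] = self.files.get(name, "") + data
        return len(data)


class RestartableFs:
    """Wrapper that hides ToyFs crashes from the caller.

    Assumption: state is restored from the last checkpoint and then
    brought up to date by replaying writes completed since then.
    """

    def __init__(self):
        self.fs = ToyFs()
        self.checkpoint = {}
        self.log = []       # writes completed since the last checkpoint
        self.restarts = 0

    def take_checkpoint(self):
        self.checkpoint = copy.deepcopy(self.fs.files)
        self.log.clear()

    def _restart(self):
        # Discard the failed instance, restore the checkpoint, replay the log.
        self.fs = ToyFs()
        self.fs.files = copy.deepcopy(self.checkpoint)
        for name, data in self.log:
            self.fs.write(name, data)
        self.restarts += 1

    def write(self, name, data):
        try:
            result = self.fs.write(name, data)
        except FsCrash:
            self._restart()
            # Re-issue the pending request so the caller never sees the crash.
            result = self.fs.write(name, data)
        self.log.append((name, data))
        return result
```

In this model a caller that triggers a fault still receives a normal return value from `write`, which mirrors the transparency goal stated in the abstract. A real kernel file system would also have to deal with in-flight requests from other threads, on-disk state, and open file descriptors, none of which this toy covers.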