The Rio file cache: surviving operating system crashes

  • Authors:
  • Peter M. Chen; Wee Teck Ng; Subhachandra Chandra; Christopher Aycock; Gurushankar Rajamani; David Lowell

  • Affiliations:
  • Computer Science and Engineering Division, Department of Electrical Engineering and Computer Science, University of Michigan (all authors)

  • Venue:
  • Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems
  • Year:
  • 1996

Abstract

One of the fundamental limits to high-performance, high-reliability file systems is memory's vulnerability to system crashes. Because memory is viewed as unsafe, systems periodically write data back to disk. The extra disk traffic lowers performance, and the delay period before data is safe lowers reliability. The goal of the Rio (RAM I/O) file cache is to make ordinary main memory safe for persistent storage by enabling memory to survive operating system crashes. Reliable memory enables a system to achieve the best of both worlds: reliability equivalent to a write-through file cache, where every write is instantly safe, and performance equivalent to a pure write-back cache, with no reliability-induced writes to disk. To achieve reliability, we protect memory during a crash and restore it during a reboot (a "warm" reboot). Extensive crash tests show that even without protection, warm reboot enables memory to achieve reliability close to that of a write-through file system. Adding protection makes memory even safer than a write-through file system while adding essentially no overhead. By eliminating reliability-induced disk writes, Rio performs 4-22 times as fast as a write-through file system, 2-14 times as fast as a standard Unix file system, and 1-3 times as fast as an optimized system that risks losing 30 seconds of data and metadata.
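
The trade-off the abstract describes can be illustrated with a toy model. The sketch below is not the Rio implementation; `ToyCache`, `run`, and the workload of 100 writes over 10 blocks are hypothetical names and values chosen only for illustration. It counts disk writes under a write-through and a write-back policy, and treats a crash in which memory survives (the warm reboot case) as losing nothing.

```python
class ToyCache:
    """Toy block cache (illustrative only); policy is 'write-through' or 'write-back'."""

    def __init__(self, policy):
        if policy not in ("write-through", "write-back"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.memory = {}      # block number -> data held in the cache
        self.disk = {}        # block number -> data on stable storage
        self.dirty = set()    # blocks newer in memory than on disk
        self.disk_writes = 0  # number of writes that reached the disk

    def _flush_block(self, block):
        self.disk[block] = self.memory[block]
        self.disk_writes += 1
        self.dirty.discard(block)

    def write(self, block, data):
        self.memory[block] = data
        if self.policy == "write-through":
            # Every write goes to disk immediately, so it is instantly safe.
            self._flush_block(block)
        else:
            # Write-back: the data stays in memory until something flushes it.
            self.dirty.add(block)

    def crash(self, memory_survives):
        """Simulate a crash; return the set of blocks whose latest data is lost."""
        if memory_survives:
            # Assumption of this sketch: a warm reboot restores memory intact.
            return set()
        lost = set(self.dirty)
        self.memory.clear()
        self.dirty.clear()
        return lost


def run(policy, memory_survives):
    """Apply 100 writes over 10 blocks, crash, and report (disk_writes, blocks_lost)."""
    cache = ToyCache(policy)
    for i in range(100):
        cache.write(i % 10, f"v{i}")
    lost = cache.crash(memory_survives)
    return cache.disk_writes, len(lost)


write_through = run("write-through", memory_survives=False)
write_back = run("write-back", memory_survives=False)
reliable_memory = run("write-back", memory_survives=True)
```

In this model the write-through run pays one disk write per update and loses nothing, the plain write-back run pays no disk writes but loses every dirty block, and the write-back run with surviving memory pays no disk writes and loses nothing. The model ignores memory corruption during the crash, which is what the protection mechanism in the abstract addresses.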