The Rio file cache: surviving operating system crashes

Authors:
Peter M. Chen; Wee Teck Ng; Subhachandra Chandra; Christopher Aycock; Gurushankar Rajamani; David Lowell
Affiliations:
Computer Science and Engineering Division, Department of Electrical Engineering and Computer Science, University of Michigan (all authors)
Venue:
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Year:
1996

Abstract

One of the fundamental limits to high-performance, high-reliability file systems is memory's vulnerability to system crashes. Because memory is viewed as unsafe, systems periodically write data back to disk. The extra disk traffic lowers performance, and the delay period before data is safe lowers reliability. The goal of the Rio (RAM I/O) file cache is to make ordinary main memory safe for persistent storage by enabling memory to survive operating system crashes. Reliable memory enables a system to achieve the best of both worlds: reliability equivalent to a write-through file cache, where every write is instantly safe, and performance equivalent to a pure write-back cache, with no reliability-induced writes to disk. To achieve reliability, we protect memory during a crash and restore it during a reboot (a "warm" reboot). Extensive crash tests show that even without protection, warm reboot enables memory to achieve reliability close to that of a write-through file system. Adding protection makes memory even safer than a write-through file system while adding essentially no overhead. By eliminating reliability-induced disk writes, Rio performs 4-22 times as fast as a write-through file system, 2-14 times as fast as a standard Unix file system, and 1-3 times as fast as an optimized system that risks losing 30 seconds of data and metadata.
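
The trade-off described in the abstract can be illustrated with a toy model. The sketch below is not Rio's implementation: `ToyFileCache`, `run`, and the `memory_survives` flag are hypothetical names invented here, and a crash is reduced to the single question of whether cached memory is still readable afterwards. It only counts disk writes and lost blocks under a write-through policy, a pure write-back policy, and a write-back policy whose memory survives the crash.

```python
class ToyFileCache:
    """Toy block cache that counts disk writes (illustrative only, not Rio)."""

    def __init__(self, write_through):
        self.write_through = write_through
        self.memory = {}      # block id -> latest data held in the cache
        self.dirty = set()    # blocks newer in memory than on disk
        self.disk = {}        # block id -> data on disk
        self.disk_writes = 0

    def _flush(self, block):
        # Copy one block from memory to disk and mark it clean.
        self.disk[block] = self.memory[block]
        self.disk_writes += 1
        self.dirty.discard(block)

    def write(self, block, data):
        self.memory[block] = data
        if self.write_through:
            self._flush(block)      # every write goes to disk immediately
        else:
            self.dirty.add(block)   # write-back: defer the disk write

    def crash(self, memory_survives):
        """Return the set of blocks whose latest contents are lost."""
        if memory_survives:
            # Assumption: a warm reboot can read the dirty blocks back.
            return set()
        lost = set(self.dirty)
        self.memory.clear()
        self.dirty.clear()
        return lost


def run(write_through, memory_survives):
    """Issue 100 writes over 10 blocks, crash, and report (disk writes, blocks lost)."""
    cache = ToyFileCache(write_through)
    for i in range(100):
        cache.write(i % 10, f"v{i}")
    lost = cache.crash(memory_survives)
    return cache.disk_writes, len(lost)


if __name__ == "__main__":
    print("write-through:            ", run(True, False))
    print("write-back, memory lost:  ", run(False, False))
    print("write-back, memory kept:  ", run(False, True))
```

In this model the write-through cache loses nothing but pays one disk write per file write, the plain write-back cache issues no disk writes but loses every dirty block, and only the surviving-memory case avoids both costs. The model ignores eviction, metadata, and memory corruption during the crash, which the protection mechanism in the abstract is meant to address.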