One of the fundamental limits to high-performance, high-reliability file systems is memory's vulnerability to system crashes. Because memory is viewed as unsafe, systems periodically write data back to disk. The extra disk traffic lowers performance, and the delay period before data is safe lowers reliability. The goal of the Rio (RAM I/O) file cache is to make ordinary main memory safe for persistent storage by enabling memory to survive operating system crashes. Reliable memory enables a system to achieve the best of both worlds: reliability equivalent to a write-through file cache, where every write is instantly safe, and performance equivalent to a pure write-back cache, with no reliability-induced writes to disk. To achieve reliability, we protect memory during a crash and restore it during a reboot (a "warm" reboot). Extensive crash tests show that even without protection, warm reboot enables memory to achieve reliability close to that of a write-through file system. Adding protection makes memory even safer than a write-through file system while adding essentially no overhead. By eliminating reliability-induced disk writes, Rio performs 4-22 times as fast as a write-through file system, 2-14 times as fast as a standard Unix file system, and 1-3 times as fast as an optimized system that risks losing 30 seconds of data and metadata.
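The trade-off described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Rio's implementation: the class `ToyCache`, the policy names, the `run` helper, and the synthetic workload are all hypothetical, and the warm reboot and memory protection mechanisms are not modelled. It counts disk writes and the number of blocks that exist only in memory for three policies: write-through, a delayed write-back with a 30-second flush interval, and a pure write-back cache.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCache:
    """Toy file cache that counts disk writes under a given flush policy.

    Hypothetical model for illustration only; it is not Rio's design.
    """
    policy: str                    # "write_through", "delayed", or "write_back"
    flush_interval: float = 30.0   # seconds; only used by the "delayed" policy
    dirty: dict = field(default_factory=dict)   # blocks held only in memory
    disk: dict = field(default_factory=dict)    # blocks already on disk
    disk_writes: int = 0
    last_flush: float = 0.0

    def write(self, now, block, data):
        if self.policy == "write_through":
            # Every write goes to disk immediately: safe, but slow.
            self.disk[block] = data
            self.disk_writes += 1
            return
        # Otherwise the write stays in memory until a flush happens.
        self.dirty[block] = data
        if self.policy == "delayed" and now - self.last_flush >= self.flush_interval:
            self.flush(now)

    def flush(self, now):
        # Write each dirty block once, then mark the cache clean.
        self.disk.update(self.dirty)
        self.disk_writes += len(self.dirty)
        self.dirty.clear()
        self.last_flush = now

    def at_risk(self):
        """Blocks that would be lost in a crash if memory did not survive."""
        return len(self.dirty)


def run(policy):
    """Replay a synthetic workload: one write per second to 4 hot blocks."""
    cache = ToyCache(policy)
    for t in range(100):
        cache.write(float(t), block=t % 4, data=t)
    return cache.disk_writes, cache.at_risk()


if __name__ == "__main__":
    for policy in ("write_through", "delayed", "write_back"):
        writes, risk = run(policy)
        print(f"{policy:14s} disk_writes={writes:3d} blocks_only_in_memory={risk}")
```

In this model the pure write-back policy issues no reliability-induced disk writes but leaves blocks only in memory. The abstract's claim is that if memory itself survives crashes, those blocks are no longer at risk, so the write-back write count can be combined with write-through reliability.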