Reimplementing the Cedar file system using logging and group commit
SOSP '87 Proceedings of the eleventh ACM Symposium on Operating systems principles
Software diversity in computerized control systems
Software diversity in computerized control systems
The design and implementation of a log-structured file system
ACM Transactions on Computer Systems (TOCS)
File-system development with stackable layers
ACM Transactions on Computer Systems (TOCS) - Special issue on operating systems principles
BASE: using abstraction to improve fault tolerance
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Bugs as deviant behavior: a general approach to inferring errors in systems code
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
An empirical study of operating systems errors
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Increasing relevance of memory hardware errors: a case for recoverable programming models
EW 9 Proceedings of the 9th workshop on ACM SIGOPS European workshop: beyond the PC: new challenges for the operating system
Transaction Processing: Concepts and Techniques
Transaction Processing: Concepts and Techniques
Venti: A New Approach to Archival Storage
FAST '02 Proceedings of the Conference on File and Storage Technologies
Planned Extensions to the Linux Ext2/Ext3 Filesystem
Proceedings of the FREENIX Track: 2002 USENIX Annual Technical Conference
Improving the reliability of commodity operating systems
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Model-Based Failure Analysis of Journaling File Systems
DSN '05 Proceedings of the 2005 International Conference on Dependable Systems and Networks
Proceedings of the twentieth ACM symposium on Operating systems principles
C-Miner: Mining Block Correlations in Storage Systems
FAST '04 Proceedings of the 3rd USENIX Conference on File and Storage Technologies
Automatically Generating Malicious Disks using Symbolic Execution
SP '06 Proceedings of the 2006 IEEE Symposium on Security and Privacy
Increasing distributed storage survivability with a stackable RAID-like file system
CCGRID '05 Proceedings of the Fifth IEEE International Symposium on Cluster Computing and the Grid - Volume 01
An analysis of compare-by-hash
HOTOS'03 Proceedings of the 9th conference on Hot Topics in Operating Systems - Volume 9
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Using model checking to find serious file system errors
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
An analysis of latent sector errors in disk drives
Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
File system design for an NFS file server appliance
WTEC'94 Proceedings of the USENIX Winter 1994 Technical Conference on USENIX Winter 1994 Technical Conference
Single instance storage in Windows® 2000
WSS'00 Proceedings of the 4th conference on USENIX Windows Systems Symposium - Volume 4
EXPLODE: a lightweight, general system for finding serious storage system errors
OSDI '06 Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7
N-variant systems: a secretless framework for security through diversity
USENIX-SS'06 Proceedings of the 15th conference on USENIX Security Symposium - Volume 15
Compare-by-hash: a reasoned analysis
ATEC '06 Proceedings of the annual conference on USENIX '06 Annual Technical Conference
Scalability in the XFS file system
ATEC '96 Proceedings of the 1996 annual conference on USENIX Annual Technical Conference
Tolerating byzantine faults in transaction processing systems using commit barrier scheduling
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
Parity lost and parity regained
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
An analysis of data corruption in the storage stack
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
DRAM errors in the wild: a large-scale field study
Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems
CuriOS: improving reliability through operating system structure
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
SQCK: a declarative file system checker
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Fast and cautious evolution of cloud storage
HotStorage'10 Proceedings of the 2nd USENIX conference on Hot topics in storage and file systems
Scalable testing of file system checkers
Proceedings of the 7th ACM european conference on Computer Systems
Recon: verifying file system consistency at runtime
FAST'12 Proceedings of the 10th USENIX conference on File and Storage Technologies
Recon: Verifying file system consistency at runtime
ACM Transactions on Storage (TOS)
HARDFS: hardening HDFS with selective and lightweight versioning
FAST'13 Proceedings of the 11th USENIX conference on File and Storage Technologies
Checking the integrity of transactional mechanisms
FAST'14 Proceedings of the 12th USENIX conference on File and Storage Technologies
Hi-index | 0.00 |
We introduce EnvyFS, an N-version local file system designed to improve reliability in the face of file-system bugs. EnvyFS, implemented as a thin VFS-like layer near the top of the storage stack, replicates file-system metadata and data across existing and diverse commodity file systems (e.g., ext3, ReiserFS, JFS). It uses majority-consensus to operate correctly despite the sometimes faulty behavior of an underlying commodity child file system. Through experimentation, we show EnvyFS is robust to a wide range of failure scenarios, thus delivering on its promise of increased fault tolerance; however, performance and capacity overheads can be significant. To remedy this issue, we introduce SubSIST, a novel single-instance store designed to operate in an N-version environment. In the common case where all child file systems are working properly, SubSIST coalesces most blocks and thus greatly reduces time and space overheads. In the rare case where a child makes a mistake, SubSIST does not propagate the error to other children, and thus preserves the ability of EnvyFS to detect and recover from bugs that affect data reliability. Overall, EnvyFS and SubSIST combine to significantly improve reliability with only modest space and time overheads.