EIO: error handling is occasionally correct

Authors:
Haryadi S. Gunawi;Cindy Rubio-González;Andrea C. Arpaci-Dusseau;Remzi H. Arpaci-Dusseau;Ben Liblit
Affiliations:
Computer Sciences Department, University of Wisconsin-Madison;Computer Sciences Department, University of Wisconsin-Madison;Computer Sciences Department, University of Wisconsin-Madison;Computer Sciences Department, University of Wisconsin-Madison;Computer Sciences Department, University of Wisconsin-Madison
Venue:
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
Year:
2008

Abstract

The reliability of file systems depends in part on how well they propagate errors. We develop a static analysis technique, EDP, that analyzes how file systems and storage device drivers propagate error codes. Running EDP on all file systems and three major storage device drivers in Linux 2.6, we find that errors are often propagated incorrectly: 1153 calls (13%) drop an error code without handling it. We perform a set of analyses to rank the robustness of each subsystem based on the completeness of its error propagation, and find that many popular file systems are less robust than other available choices. We confirm that write errors are neglected more often than read errors. We also find that many violations are not corner-case mistakes, but perhaps intentional choices. Finally, we show that inter-module calls play a part in incorrect error propagation, but that chained propagations do not. In conclusion, error propagation appears complex and hard to perform correctly in modern systems.
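
To make "drops an error code without handling it" concrete, here is a minimal sketch in C of the pattern such an analysis looks for. It is an illustration only, not code from the paper or from the Linux kernel: the functions block_write, sync_dropping_error, and sync_propagating_error, and the disk_ok flag, are hypothetical names invented for this example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag simulating device health (assumption for this sketch). */
int disk_ok = 1;

/* Hypothetical low-level write: returns 0 on success, -EIO on failure. */
int block_write(const char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return disk_ok ? 0 : -EIO;
}

/* Broken propagation: the return value of block_write() is discarded,
 * so the caller is told the operation succeeded even when it failed. */
int sync_dropping_error(const char *buf, size_t len)
{
    block_write(buf, len);   /* error code dropped here */
    return 0;
}

/* Complete propagation: the error code is checked and passed upward. */
int sync_propagating_error(const char *buf, size_t len)
{
    int err = block_write(buf, len);
    if (err < 0)
        return err;          /* caller sees -EIO */
    return 0;
}
```

With disk_ok set to 0, the first variant still reports success, while the second returns -EIO to its caller. The first variant is the kind of call the abstract counts as dropping an error code.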