This paper presents a new mechanism that enables applications to run correctly when device drivers fail. Because device drivers are the principal failing component in most systems, reducing driver-induced failures greatly improves overall reliability. Earlier work has shown that an operating system can survive driver failures [33], but the applications that depend on those drivers cannot. Thus, while operating system reliability was greatly improved, application reliability generally was not. To remedy this situation, we introduce a new operating system mechanism called a shadow driver. A shadow driver monitors device drivers and transparently recovers from driver failures. Moreover, it assumes the role of the failed driver during recovery. In this way, applications using the failed driver, as well as the kernel itself, continue to function as expected. We implemented shadow drivers for the Linux operating system and tested them on over a dozen device drivers. Our results show that applications and the OS can indeed survive the failure of a variety of device drivers. Moreover, shadow drivers impose minimal performance overhead. Lastly, they can be introduced with only modest changes to the OS kernel and with no changes at all to existing device drivers.
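The monitor-and-recover idea described above can be illustrated with a small user-level sketch. This is a toy model, not the Linux implementation the paper describes: every name in it (`FlakyDriver`, `ShadowDriver`, `DriverFailure`, `make_factory`) is hypothetical, and the recovery policy shown (restart the driver, replay logged configuration, retry the failed request) is an assumption chosen only to show the shape of the mechanism.

```python
class DriverFailure(Exception):
    """Raised by the toy driver to simulate a driver fault (hypothetical)."""


class FlakyDriver:
    """Toy stand-in for a device driver that can fail on a chosen write call."""

    def __init__(self, fail_on_call=None):
        self.config = {}
        self.calls = 0
        self.fail_on_call = fail_on_call

    def configure(self, key, value):
        self.config[key] = value

    def write(self, data):
        self.calls += 1
        if self.calls == self.fail_on_call:
            raise DriverFailure("simulated driver fault")
        return len(data)


class ShadowDriver:
    """Sits between the application and the driver (assumed design).

    While the driver is healthy it only observes and logs configuration
    requests. When the driver fails it takes over: it restarts the driver,
    replays the logged configuration, and completes the pending request,
    so the caller never sees the failure.
    """

    def __init__(self, driver_factory):
        self._factory = driver_factory
        self._driver = driver_factory()
        self._config_log = []
        self.recoveries = 0

    def configure(self, key, value):
        # Monitoring: remember state-changing requests for later replay.
        self._config_log.append((key, value))
        self._driver.configure(key, value)

    def write(self, data):
        try:
            return self._driver.write(data)
        except DriverFailure:
            self._recover()
            # Retry on the freshly restarted driver.
            return self._driver.write(data)

    def _recover(self):
        self.recoveries += 1
        self._driver = self._factory()
        for key, value in self._config_log:
            self._driver.configure(key, value)


def make_factory():
    """Return a factory whose first driver fails on its second write."""
    created = []

    def factory():
        driver = FlakyDriver(fail_on_call=2 if not created else None)
        created.append(driver)
        return driver

    return factory


shadow = ShadowDriver(make_factory())
shadow.configure("mtu", 1500)
results = [shadow.write(b"abc"), shadow.write(b"defgh"), shadow.write(b"x")]
```

In this sketch the second write hits the simulated fault, yet the caller still receives a normal result for all three writes, and the restarted driver carries the same configuration as the one that failed. A real kernel mechanism has to deal with concerns this toy ignores, such as in-flight requests, kernel-held references to the driver, and device state.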