Hypervisor-based fault tolerance
SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
The Rio file cache: surviving operating system crashes
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Monitoring shared virtual memory performance on a Myrinet-based PC cluster
ICS '98 Proceedings of the 12th international conference on Supercomputing
Fast cluster failover using virtual memory-mapped communication
ICS '99 Proceedings of the 13th international conference on Supercomputing
Performance and scalability of EJB applications
OOPSLA '02 Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Improving the reliability of commodity operating systems
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Remote Repair of Operating System State Using Backdoors
ICAC '04 Proceedings of the First International Conference on Autonomic Computing
Microreboot — A technique for cheap recovery
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Fine-grained failover using connection migration
USITS'01 Proceedings of the 3rd conference on USENIX Symposium on Internet Technologies and Systems - Volume 3
Towards highly available and scalable high performance clusters
Journal of Computer and System Sciences
Exploring recovery from operating system lockups
ATC'07 2007 USENIX Annual Technical Conference on Proceedings of the USENIX Annual Technical Conference
Multiprimary Support for the Availability of Cluster-Based Stateful Firewalls Using FT-FW
ESORICS '08 Proceedings of the 13th European Symposium on Research in Computer Security: Computer Security
Practical and low-overhead masking of failures of TCP-based servers
ACM Transactions on Computer Systems (TOCS)
Towards building a highly-available cluster based model for high performance computing
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Hi-index | 0.00 |
Current Internet service architectures lack support for salvaging stateful client sessions when the underlying operating system fails due to hangs, crashes, deadlocks, or panics. The Backdoors (BD) system is designed to detect such failures and recover service sessions in clusters of Internet servers by extracting lightweight state associated with client service sessions from server memory. The BD architecture combines hardware and software mechanisms to enable accurate monitoring and remote healing actions, even in the presence of failures that render a system unavailable.