Recovering Internet Service Sessions from Operating System Failures

Authors:
Florin Sultan;Aniruddha Bohra;Stephen Smaldone;Yufei Pan;Pascal Gallard;Iulian Neamtiu;Liviu Iftode
Affiliations:
Rutgers University;Rutgers University;Rutgers University;Rutgers University;IRISA/INRIA;University of Maryland, College Park;Rutgers University
Venue:
IEEE Internet Computing
Year:
2005

Abstract

Current Internet service architectures lack support for salvaging stateful client sessions when the underlying operating system fails due to hangs, crashes, deadlocks, or panics. The Backdoors (BD) system is designed to detect such failures and recover service sessions in clusters of Internet servers by extracting lightweight state associated with client service sessions from server memory. The BD architecture combines hardware and software mechanisms to enable accurate monitoring and remote healing actions, even in the presence of failures that render a system unavailable.