Context-aware fault tolerance in migratory services

Authors:
Oriana Riva;Josiane Nzouonta;Cristian Borcea
Affiliations:
ETH Zürich, Zürich, Switzerland;New Jersey Institute of Technology, Newark, NJ;New Jersey Institute of Technology, Newark, NJ
Venue:
Proceedings of the 5th Annual International Conference on Mobile and Ubiquitous Systems: Computing, Networking, and Services
Year:
2008

Abstract

Mobile ad hoc networks can be leveraged to provide ubiquitous services capable of acquiring, processing, and sharing real-time information from the physical world. Unlike Internet services, these services have to survive frequent and unpredictable faults such as disconnections, crashes, or users turning off their devices. This paper describes a context-aware fault tolerance mechanism for our migratory services model. In this model, a per-client service instance transparently migrates to different nodes in the network to provide a continuous and semantically correct interaction with its client. The proposed fault tolerance mechanism extends the primary-backup approach with a context-aware checkpointing process. The backup node is dynamically selected based on its distance from the client and the service, the similarity of its mobility pattern to those of the client and the service, the frequency of the checkpointing process, and the size of the checkpointing state. We demonstrate the feasibility of our approach through a prototype implementation tested in a small-scale ad hoc network of smart phones. Additionally, we simulate our mechanism in a realistic urban environment with 300 pedestrians, cyclists, and cars. We compare against two baselines: one that places the backup on a neighbor of the service node, and one that places it on the client node itself. Our mechanism performs as much as 80% better than the former in recovery ratio and three times better than the latter in network overhead, while achieving better or similar recovery latency.
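
The abstract names four inputs to backup selection (distance, mobility similarity, checkpointing frequency, and checkpoint state size) but does not give the selection function. The following Python sketch is only an illustration of how such inputs could be combined into a single score. The `Node` type, the hop estimate from a fixed `radio_range`, the cosine-based similarity measure, and the weights `w_sim`, `w_hops`, and `w_traffic` are all assumptions made for this example and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Node:
    """Hypothetical view of a node: planar position (m) and velocity (m/s)."""
    name: str
    position: Tuple[float, float]
    velocity: Tuple[float, float]


def estimated_hops(a: Node, b: Node, radio_range: float) -> float:
    """Rough hop count: Euclidean distance divided by an assumed radio range."""
    return max(1.0, math.dist(a.position, b.position) / radio_range)


def mobility_similarity(a: Node, b: Node) -> float:
    """Cosine similarity of velocity vectors, rescaled to the range [0, 1]."""
    speed_a = math.hypot(*a.velocity)
    speed_b = math.hypot(*b.velocity)
    if speed_a == 0.0 or speed_b == 0.0:
        return 0.5  # no direction to compare; treat as neutral
    dot = a.velocity[0] * b.velocity[0] + a.velocity[1] * b.velocity[1]
    return (dot / (speed_a * speed_b) + 1.0) / 2.0


def backup_score(candidate: Node, client: Node, service: Node,
                 checkpoint_hz: float, state_bytes: int,
                 radio_range: float = 100.0,
                 w_sim: float = 1.0, w_hops: float = 0.1,
                 w_traffic: float = 1e-6) -> float:
    """Illustrative score: reward similar mobility, penalise distance and
    checkpoint traffic. The weights are arbitrary assumptions."""
    hops_service = estimated_hops(candidate, service, radio_range)
    hops_client = estimated_hops(candidate, client, radio_range)
    # Use the weaker of the two similarities so the backup tracks both parties.
    similarity = min(mobility_similarity(candidate, client),
                     mobility_similarity(candidate, service))
    # Bytes per second of checkpoint data, multiplied by the hops it crosses.
    traffic = checkpoint_hz * state_bytes * hops_service
    return (w_sim * similarity
            - w_hops * (hops_service + hops_client)
            - w_traffic * traffic)


def select_backup(candidates: Iterable[Node], client: Node, service: Node,
                  checkpoint_hz: float, state_bytes: int) -> Node:
    """Return the candidate with the highest illustrative score."""
    return max(candidates,
               key=lambda n: backup_score(n, client, service,
                                          checkpoint_hz, state_bytes))
```

In this sketch, a candidate that moves in the same direction as the client and the service and sits close to both scores highest. A distant candidate is penalised twice, once for its hop count and once for the checkpoint traffic that every extra hop multiplies, which is why frequent or large checkpoints push the choice toward nearby nodes.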