Mobile ad hoc networks can be leveraged to provide ubiquitous services capable of acquiring, processing, and sharing real-time information from the physical world. Unlike Internet services, these services have to survive frequent and unpredictable faults such as disconnections, crashes, or users turning off their devices. This paper describes a context-aware fault tolerance mechanism for our migratory services model. In this model, a per-client service instance transparently migrates to different nodes in the network to provide a continuous and semantically correct interaction with its client. The proposed fault tolerance mechanism extends the primary-backup approach with a context-aware checkpointing process. The backup node is dynamically selected based on its distance from the client and service, the similarity of its mobility pattern to those of the client and service, the frequency of the checkpointing process, and the size of the checkpointed state. We demonstrate the feasibility of our approach through a prototype implementation tested in a small-scale ad hoc network of smartphones. Additionally, we simulate our mechanism in a realistic urban environment with 300 pedestrians, cyclists, and cars. We compare against two baselines: one in which the backup node is a neighbor of the service node, and one in which it is the client node itself. Our mechanism performs as much as 80% better than the former in recovery ratio and three times better than the latter in network overhead, while achieving better or similar recovery latency.
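The abstract names four inputs to backup selection (distance, mobility similarity, checkpoint frequency, and checkpoint state size) but does not give the formula that combines them. The following is a minimal sketch of one way such a selection could look, not the paper's algorithm. Everything in it is an assumption made for illustration: the `Candidate` fields, the use of hop counts as distance, cosine similarity of velocity vectors as the mobility measure, the linear score, and the weight values.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical view of a node that could act as the backup."""
    name: str
    hops_to_client: int
    hops_to_service: int
    velocity: tuple  # (vx, vy) in m/s


def cosine_similarity(a, b):
    """Cosine of the angle between two 2-D vectors; 0.0 if either is zero."""
    norm_a = math.hypot(*a)
    norm_b = math.hypot(*b)
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return (a[0] * b[0] + a[1] * b[1]) / (norm_a * norm_b)


def score(candidate, client_velocity, service_velocity,
          checkpoints_per_s, state_bytes,
          w_distance=1.0, w_mobility=1.0, w_traffic=1e-4):
    """Higher is better. The weights are illustrative, not from the paper."""
    # Assumed distance term: total hops from the candidate to both endpoints.
    distance = candidate.hops_to_client + candidate.hops_to_service
    # Assumed mobility term: mean similarity to client and service motion.
    mobility = (cosine_similarity(candidate.velocity, client_velocity)
                + cosine_similarity(candidate.velocity, service_velocity)) / 2
    # Assumed traffic term: bytes per second times hops the checkpoint travels.
    traffic = checkpoints_per_s * state_bytes * candidate.hops_to_service
    return w_mobility * mobility - w_distance * distance - w_traffic * traffic


def select_backup(candidates, client_velocity, service_velocity,
                  checkpoints_per_s, state_bytes):
    """Return the candidate with the highest score."""
    return max(
        candidates,
        key=lambda c: score(c, client_velocity, service_velocity,
                            checkpoints_per_s, state_bytes),
    )


if __name__ == "__main__":
    nodes = [
        Candidate("near_same_direction", 1, 1, (1.0, 0.0)),
        Candidate("far_same_direction", 3, 3, (1.0, 0.0)),
        Candidate("near_opposite_direction", 1, 1, (-1.0, 0.0)),
    ]
    best = select_backup(nodes, (1.0, 0.0), (1.0, 0.0),
                         checkpoints_per_s=0.5, state_bytes=2000)
    print(best.name)
```

In this sketch the traffic term grows with both checkpoint frequency and state size, so frequent or large checkpoints push the choice toward nodes close to the service, while the mobility term favors nodes likely to stay connected to both endpoints. A real deployment would need weights tuned to measured link and mobility behavior.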