A mobile computing system consists of mobile and stationary nodes connected to each other by a communication network. The presence of mobile nodes places constraints on the permissible energy consumption and the available communication bandwidth. To minimize the computation lost during recovery from node failures, a consistent snapshot of the system (a checkpoint) must be collected periodically. Locating mobile nodes adds to the checkpointing and recovery costs. Synchronous snapshot collection algorithms designed for static networks either force every node in the system to take a new local snapshot or block the underlying computation during snapshot collection. Hence, they are not suitable for mobile computing systems. If nodes instead take their local checkpoints independently, in an uncoordinated manner, each node may have to store multiple local checkpoints in stable storage. This is unsuitable for mobile nodes, which have limited memory. This paper presents a synchronous snapshot collection algorithm for mobile systems that neither forces every node to take a local snapshot nor blocks the underlying computation during snapshot collection. If a node initiates snapshot collection, local snapshots need to be taken only by those nodes that have directly or transitively affected the initiator since their last snapshots. We prove that global snapshot collection terminates within a finite time of its invocation and that the collected global snapshot is consistent. We also propose a minimal rollback/recovery algorithm in which the computation at a node is rolled back only if it depends on operations that have been undone due to the failure of one or more nodes. Both algorithms have low communication and storage overheads and meet the low energy consumption and low bandwidth constraints of mobile computing systems.
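The selection rules described above can be illustrated with a small sketch. This is not the paper's protocol, which is message-driven and runs concurrently with the computation; it is a centralized simplification under stated assumptions. The `received_from` map, the node names, and both function names are hypothetical: `received_from[p]` is assumed to hold the nodes whose messages `p` has processed since `p`'s last snapshot. The rollback function is coarser than the abstract's rule, since it tracks dependencies per node rather than per undone operation.

```python
from collections import deque


def nodes_to_checkpoint(initiator, received_from):
    """Nodes that directly or transitively affected the initiator.

    received_from[p] (hypothetical bookkeeping) is the set of nodes whose
    messages p has processed since p's last snapshot.
    """
    needed = {initiator}
    queue = deque([initiator])
    while queue:
        node = queue.popleft()
        # Every sender that influenced this node must also take a snapshot.
        for sender in received_from.get(node, ()):
            if sender not in needed:
                needed.add(sender)
                queue.append(sender)
    return needed


def nodes_to_roll_back(failed, received_from):
    """Nodes whose state depends, directly or transitively, on a failed node.

    Coarse assumption: any message received from a rolled-back node counts
    as a dependency on undone work.
    """
    rolled = set(failed)
    changed = True
    while changed:
        changed = False
        for node, senders in received_from.items():
            if node not in rolled and senders & rolled:
                rolled.add(node)
                changed = True
    return rolled


# Hypothetical dependency state: C -> B -> A -> D, with E idle.
received_from = {
    "A": {"B"},
    "B": {"C"},
    "C": set(),
    "D": {"A"},
    "E": set(),
}

checkpoint_set = nodes_to_checkpoint("A", received_from)
rollback_set = nodes_to_roll_back({"C"}, received_from)
print(sorted(checkpoint_set))
print(sorted(rollback_set))
```

In this example, D has received a message from A but has not affected it, so D is left out of the snapshot initiated by A, and the idle node E is excluded from both sets. Avoiding such unnecessary participation is the kind of saving the abstract attributes to the algorithms.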