A low-cost hybrid coordinated checkpointing protocol for mobile distributed systems

Authors:
Parveen Kumar
Affiliations:
Department of Computer Sc & Engineering, Asia Pacific Institute of Information Technology, Panipat (Haryana), India. Tel.: +91 0180 2620043/ E-mail: pk223475@yahoo.com
Venue:
Mobile Information Systems
Year:
2008

Abstract

Mobile distributed systems raise new issues such as mobility, low bandwidth of wireless channels, disconnections, limited battery power and lack of reliable stable storage on mobile nodes. In minimum-process coordinated checkpointing, some processes may not checkpoint for several checkpoint initiations. On recovery after a fault, such processes may roll back to a much earlier checkpointed state and thus cause a greater loss of computation. In all-process coordinated checkpointing, the recovery line is advanced for all processes, but the checkpointing overhead may be exceedingly high. To optimize both metrics, the checkpointing overhead and the loss of computation on recovery, we propose a hybrid checkpointing algorithm in which an all-process coordinated checkpoint is taken after the minimum-process coordinated checkpointing algorithm has been executed a fixed number of times. Thus, mobile nodes with low activity or in doze mode need not be disturbed during minimum-process checkpointing, and the recovery line is advanced for every process after an all-process checkpoint. Additionally, we try to minimize the information piggybacked onto each computation message. For minimum-process checkpointing, we design a blocking algorithm in which no useless checkpoints are taken and an effort is made to minimize the blocking of processes. We propose to delay selected messages at the receiver end. By doing so, processes are allowed to perform their normal computation, send messages and partially receive them during their blocking period. The proposed minimum-process blocking algorithm takes no useless checkpoints, at the cost of a very short blocking period.
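
The hybrid schedule described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it only shows the alternation between minimum-process and all-process rounds. The class name `HybridScheduler`, the parameter `k` (the fixed number of minimum-process rounds before an all-process round) and the caller-supplied `dependents` set are all assumptions made for illustration, and the sketch omits blocking, message delaying and piggybacked information entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class HybridScheduler:
    """Picks the processes that checkpoint at each initiation (illustrative only)."""

    processes: Set[str]
    k: int  # assumed: minimum-process rounds allowed before an all-process round
    rounds_since_all: int = 0
    initiation: int = 0
    # Index of the last initiation in which each process took a checkpoint.
    last_checkpoint: Dict[str, int] = field(default_factory=dict)

    def initiate(self, initiator: str, dependents: Set[str]) -> Set[str]:
        """Return the set of processes that checkpoint in this initiation.

        `dependents` is assumed to be supplied by the caller: the processes
        the initiator depends on since its last checkpoint.
        """
        self.initiation += 1
        if self.rounds_since_all >= self.k:
            # All-process round: the recovery line advances for every process.
            participants = set(self.processes)
            self.rounds_since_all = 0
        else:
            # Minimum-process round: idle or dozing nodes are left undisturbed.
            participants = {initiator} | (dependents & self.processes)
            self.rounds_since_all += 1
        for p in participants:
            self.last_checkpoint[p] = self.initiation
        return participants


if __name__ == "__main__":
    scheduler = HybridScheduler(processes={"p1", "p2", "p3", "p4"}, k=2)
    print(sorted(scheduler.initiate("p1", {"p2"})))  # minimum-process round
    print(sorted(scheduler.initiate("p2", {"p1"})))  # minimum-process round
    print(sorted(scheduler.initiate("p3", set())))   # all-process round
```

With `k = 2`, processes `p3` and `p4` take no checkpoint in the first two initiations, so a failure at that point could force them to roll back further than `p1` and `p2`. The third initiation involves every process, which bounds how stale any process's latest checkpoint can become. A larger `k` lowers checkpointing overhead but lets inactive processes fall further behind the recovery line.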