Consistent global snapshots are important in many distributed applications. We prove the exact conditions for an arbitrary checkpoint, or a set of checkpoints, to belong to a consistent global snapshot, a previously open problem. To describe the conditions, we introduce a generalization of Lamport's happened-before relation called a zigzag path.

Index Terms: Causality, global checkpoints, distributed systems, consistent global states, Lamport's happened-before relation.
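The abstract does not define a zigzag path, so the following is only an illustrative sketch under the formulation commonly given for it: a zigzag path runs from checkpoint A to checkpoint B if there is a chain of messages whose first message is sent after A, whose last message is received before B, and in which each subsequent message is sent by the receiver of the previous one in the same or a later checkpoint interval (possibly before the previous message arrives). A set of checkpoints can then be extended to a consistent global snapshot only if no zigzag path connects any two of its members, including a member to itself. The `Message` record, the interval numbering (interval k of a process lies between its checkpoints k and k+1), and both function names are assumptions made for this example, not taken from the paper.

```python
from collections import deque
from typing import NamedTuple


class Message(NamedTuple):
    # Hypothetical record: interval k of a process is the span between
    # its checkpoint k and its checkpoint k+1.
    sender: int
    send_interval: int
    receiver: int
    recv_interval: int


def zigzag_path_exists(messages, src, dst):
    """Return True if a zigzag path runs from checkpoint src to checkpoint dst.

    Checkpoints are (process, index) pairs.
    """
    p, i = src
    q, j = dst
    # The first message must be sent by p after checkpoint (p, i).
    queue = deque(m for m in messages if m.sender == p and m.send_interval >= i)
    seen = set(queue)
    while queue:
        m = queue.popleft()
        # The last message must be received by q before checkpoint (q, j).
        if m.receiver == q and m.recv_interval < j:
            return True
        # The next message leaves the receiver in the same or a later interval,
        # which allows it to be sent before m is actually received.
        for nxt in messages:
            if (nxt not in seen and nxt.sender == m.receiver
                    and nxt.send_interval >= m.recv_interval):
                seen.add(nxt)
                queue.append(nxt)
    return False


def can_extend_to_consistent_snapshot(messages, checkpoints):
    """Check that no zigzag path links any two checkpoints (or one to itself)."""
    return not any(zigzag_path_exists(messages, a, b)
                   for a in checkpoints for b in checkpoints)


# P0 sends m1 after its checkpoint 1; P1 receives it before its checkpoint 1.
# P1 sent m2 earlier in that same interval; P2 receives it before its checkpoint 1.
messages = [Message(0, 1, 1, 0), Message(1, 0, 2, 0)]

# P1 sends m2 to P0 (received before P0's checkpoint 1); P0 then sends m1 to P1.
cycle = [Message(0, 1, 1, 0), Message(1, 0, 0, 0)]
```

In the first pattern there is no causal chain from P0's checkpoint 1 to P2's checkpoint 1, yet `zigzag_path_exists(messages, (0, 1), (2, 1))` holds, so the two cannot appear in the same consistent snapshot. In the second pattern, P0's checkpoint 1 has a zigzag path to itself, so under this formulation it cannot belong to any consistent global snapshot.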