On the optimum checkpoint selection problem
SIAM Journal on Computing
Optimistic recovery in distributed systems
ACM Transactions on Computer Systems (TOCS)
Checkpointing and Rollback-Recovery for Distributed Systems
IEEE Transactions on Software Engineering - Special issue on distributed systems
Computing Optimal Checkpointing Strategies for Rollback and Recovery Systems
IEEE Transactions on Computers - Fault-Tolerant Computing
Efficient distributed recovery using message logging
Proceedings of the eighth annual ACM Symposium on Principles of distributed computing
Distributed snapshots: determining global states of distributed systems
ACM Transactions on Computer Systems (TOCS)
Reliability Issues in Computing System Design
ACM Computing Surveys (CSUR)
Recovery Techniques for Database Systems
ACM Computing Surveys (CSUR)
Performance analysis of checkpointing strategies
ACM Transactions on Computer Systems (TOCS)
Concurrent Robust Checkpointing and Recovery in Distributed Systems
Proceedings of the Fourth International Conference on Data Engineering
A message system supporting fault tolerance
SOSP '83 Proceedings of the ninth ACM symposium on Operating systems principles
Event graph visualization for debugging large applications
SPDT '96 Proceedings of the SIGMETRICS symposium on Parallel and distributed tools
Optimistic Crash Recovery without Changing Application Messages
IEEE Transactions on Parallel and Distributed Systems
On Coordinated Checkpointing in Distributed Systems
IEEE Transactions on Parallel and Distributed Systems
Mutable Checkpoints: A New Checkpointing Approach for Mobile Computing Systems
IEEE Transactions on Parallel and Distributed Systems
Complete Process Recovery: Using Vector Time to Handle Multiple Failures in Distributed Systems
IEEE Parallel & Distributed Technology: Systems & Technology
Checkpointing with mutable checkpoints
Theoretical Computer Science - Dependable computing
Asynchronous recovery without using vector timestamps
Journal of Parallel and Distributed Computing
Distributed Checkpointing on Clusters with Dynamic Striping and Staggering
ASIAN '02 Proceedings of the7th Asian Computing Science Conference on Advances in Computing Science: Internet Computing and Modeling, Grid Computing, Peer-to-Peer Computing, and Cluster
An Efficient Optimistic Message Logging Scheme for Recoverable Mobile Computing Systems
IEEE Transactions on Mobile Computing
Garbage collection in message passing distributed systems
PAS '95 Proceedings of the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis
Deadlocks in fully uncoordinated checkpointing rollback recovery systems
WORDS '97 Proceedings of the 3rd Workshop on Object-Oriented Real-Time Dependable Systems - (WORDS '97)
Checkpoint and Rollback in Asynchronous Distributed Systems
INFOCOM '97 Proceedings of the INFOCOM '97. Sixteenth Annual Joint Conference of the IEEE Computer and Communications Societies. Driving the Information Revolution
Concurrent checkpoint initiation and recovery algorithms on asynchronous ring networks
Journal of Parallel and Distributed Computing
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 1 - Volume 02
A novel min-process checkpointing scheme for mobile computing systems
Journal of Systems Architecture: the EUROMICRO Journal
Performance analysis of different checkpointing and recovery schemes using stochastic model
Journal of Parallel and Distributed Computing
Starc: static analysis for efficient repair of complex data
Proceedings of the 22nd annual ACM SIGPLAN conference on Object-oriented programming systems and applications
A synchronous checkpointing protocol for mobile distributed systems: probabilistic approach
International Journal of Information and Computer Security
A low-cost hybrid coordinated checkpointing protocol for mobile distributed systems
Mobile Information Systems
Efficient solving of structural constraints
ISSTA '08 Proceedings of the 2008 international symposium on Software testing and analysis
WSEAS Transactions on Computers
Journal of Parallel and Distributed Computing
Agent based dynamic recovery protocol in distributed databases
ISPDC'03 Proceedings of the Second international conference on Parallel and distributed computing
An efficient computing-checkpoint based coordinated checkpoint algorithm
EUC'06 Proceedings of the 2006 international conference on Embedded and Ubiquitous Computing
Using computing checkpoints implement consistent low-cost non-blocking coordinated checkpointing
PDCAT'04 Proceedings of the 5th international conference on Parallel and Distributed Computing: applications and Technologies
Efficiently generating structurally complex inputs with thousands of objects
ECOOP'07 Proceedings of the 21st European conference on Object-Oriented Programming
HotSnap: a hot distributed snapshot system for virtual machine cluster
LISA'13 Proceedings of the 27th international conference on Large Installation System Administration
Hi-index | 0.00 |
Given the increased availability of general purpose parallel computers two issues arise:One needs to compare the performance of the different available platforms using realisticexamples, and it is necessary to write application software that can be ported easily inorder to take advantage of different platforms. The authors address these issues from anapplications point of view. They are interested in the use of general purpose parallelcomputers for simulation tasks needed during the design of very large scale integrated(VLSI) circuits. They characterize the simulation task as a useful benchmark andintroduce a high level process view of parallel simulation that is helpful for derivingportable parallel programs. Details of the partitioning strategy and the simulation algorithm used in the application are given. They discuss their implementation on different parallel machines and give statistics of various experiments.