Optimistic recovery in distributed systems
ACM Transactions on Computer Systems (TOCS)
Distributed snapshots: determining global states of distributed systems
ACM Transactions on Computer Systems (TOCS)
Time, clocks, and the ordering of events in a distributed system
Communications of the ACM
A message system supporting fault tolerance
SOSP '83 Proceedings of the ninth ACM symposium on Operating systems principles
Publishing: a reliable broadcast communication mechanism
SOSP '83 Proceedings of the ninth ACM symposium on Operating systems principles
Preserving and using context information in interprocess communication
ACM Transactions on Computer Systems (TOCS)
Efficient distributed recovery using message logging
Proceedings of the eighth annual ACM Symposium on Principles of distributed computing
Transparent optimistic rollback recovery
ACM SIGOPS Operating Systems Review
About logical clocks for distributed systems
ACM SIGOPS Operating Systems Review
Adaptive message logging for incremental replay of message-passing programs
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
On the relevance of communication costs of rollback-recovery protocols
Proceedings of the fourteenth annual ACM symposium on Principles of distributed computing
Fast cluster failover using virtual memory-mapped communication
ICS '99 Proceedings of the 13th international conference on Supercomputing
Transparent optimistic rollback recovery
EW 4 Proceedings of the 4th workshop on ACM SIGOPS European workshop
Adaptive Message Logging for Incremental Program Replay
IEEE Parallel & Distributed Technology: Systems & Technology
Recovering from Multiple Process Failures in the Time Warp Mechanism
IEEE Transactions on Computers
A Service Acquisition Mechanism for Server-Based Heterogeneous Distributed Systems
IEEE Transactions on Parallel and Distributed Systems
IEEE Transactions on Software Engineering
Transparent Fault Tolerance for Web Services Based Architectures
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
High-Level Synthesis of Recoverable Microarchitectures
EDTC '96 Proceedings of the 1996 European conference on Design and Test
Garbage collection in message passing distributed systems
PAS '95 Proceedings of the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis
Optimal Recovery Point Insertion for High-Level Synthesis of Recoverable Microarchitectures
FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
On Slicing a Distributed Computation
ICDCS '01 Proceedings of the 21st International Conference on Distributed Computing Systems
Multiversioning and Logging in the Grasshopper Kernel Persistent Store
IWOOOS '95 Proceedings of the 4th International Workshop on Object-Orientation in Operating Systems
A service acquisition mechanism for the client/service model in cygnus
CASCON '91 Proceedings of the 1991 conference of the Centre for Advanced Studies on Collaborative research
Rx: treating bugs as allergies---a safe method to survive software failures
Proceedings of the twentieth ACM symposium on Operating systems principles
Techniques and applications of computation slicing
Distributed Computing
ML grid programming with ConCert
Proceedings of the 2006 workshop on ML
Quasi-atomic recovery for distributed agents
Parallel Computing
Flashback: a lightweight extension for rollback and deterministic replay for software debugging
ATEC '04 Proceedings of the annual conference on USENIX Annual Technical Conference
Exploring failure transparency and the limits of generic recovery
OSDI'00 Proceedings of the 4th conference on Symposium on Operating System Design & Implementation - Volume 4
Transparent fault tolerance for parallel applications on networks of workstations
ATEC '96 Proceedings of the 1996 annual conference on USENIX Annual Technical Conference
Towards an Autonomic Element Architecture for ASSL
SEAMS '07 Proceedings of the 2007 International Workshop on Software Engineering for Adaptive and Self-Managing Systems
Rx: Treating bugs as allergies—a safe method to survive software failures
ACM Transactions on Computer Systems (TOCS)
Evaluating the viability of process replication reliability for exascale systems
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
Evaluating operating system vulnerability to memory errors
Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers
Alleviating scalability issues of checkpointing protocols
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
The viability of using compression to decrease message log sizes
Euro-Par'12 Proceedings of the 18th international conference on Parallel processing workshops
Evaluating the feasibility of using memory content similarity to improve system resilience
Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers
Evaluating energy savings for checkpoint/restart
E2SC '13 Proceedings of the 1st International Workshop on Energy Efficient Supercomputing